Transcribe WhatsApp Audio Messages with Whisper AI via Groq

Built by Noriwal AlMa Jr
Created on June 07, 2026

Description

WhatsApp Audio Transcriber Bot

Overview
Automatically transcribe WhatsApp audio messages to text using AI-powered speech recognition. This workflow receives audio messages via webhook, processes them through Groq's Whisper API, and replies with the transcribed text in the same conversation.

Use Cases
Accessibility: Help users with hearing impairments access audio content
Workplace Communication: Quickly scan audio messages in professional settings
Language Learning: Get text versions of audio for better comprehension
Meeting Notes: Convert voice messages to searchable text format
Multilingual Support: Transcribe audio in Portuguese (configurable for other languages)

How it Works
Message Reception: A webhook receives WhatsApp messages in real time
Audio Detection: A Switch node passes only audio messages through
Format Conversion: The base64 audio is converted to an MP3 file
AI Transcription: The audio is sent to the Groq API and transcribed with the Whisper Large V3 model
Response Delivery: The transcribed text is sent back to the original conversation
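
The detection step, together with the self-message filtering listed under Key Features, can be sketched as a single predicate. This is a minimal illustration, not the workflow's actual Switch expression: the function name is hypothetical, and the payload paths (`data.messageType`, `data.key.fromMe`) are assumptions about the shape of the Evolution API webhook body.

```python
from typing import Any, Mapping


def should_transcribe(event: Mapping[str, Any]) -> bool:
    """Return True only for audio messages that the bot did not send itself.

    Assumed payload shape (not confirmed by the workflow description):
    event["data"]["messageType"] and event["data"]["key"]["fromMe"].
    """
    data = event.get("data", {})
    from_me = bool(data.get("key", {}).get("fromMe", False))
    is_audio = data.get("messageType") == "audioMessage"
    return is_audio and not from_me
```

Anything that fails this check (text, images, or the bot's own replies) would simply be dropped before the conversion step.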

Key Features
✅ Real-time Processing: Transcribes incoming audio messages as they arrive, typically within a few seconds
✅ High Accuracy: Uses Whisper Large V3 model for reliable transcription
✅ Auto-Reply: Automatically responds in the same WhatsApp conversation
✅ Message Quoting: References the original audio message in the reply
✅ Portuguese Optimized: Configured for Brazilian Portuguese transcription
✅ Self-Message Filtering: Ignores messages sent by the bot itself

Prerequisites
Required Services
Evolution API: WhatsApp integration service
Groq API: AI transcription service (Whisper model)
n8n Instance: Workflow automation platform

API Keys & Configuration
Groq API key (set as environment variable: GROQ_API_KEY)
Evolution API instance properly configured
Webhook URL configured in Evolution API
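
If you want to verify the key outside n8n before importing the workflow, a small check like the following may help. It is only a sketch: the helper name `load_groq_api_key` is hypothetical, and the only thing taken from this description is the variable name GROQ_API_KEY.

```python
import os
from typing import Mapping, Optional


def load_groq_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Read GROQ_API_KEY and fail early with a clear message if it is missing."""
    source = os.environ if env is None else env
    key = source.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set or is empty")
    return key
```

Passing a dictionary instead of relying on `os.environ` makes the check easy to exercise in isolation.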

Setup Instructions
Import Workflow: Import the JSON workflow into your n8n instance
Configure Environment: Set GROQ_API_KEY environment variable
Setup Webhook: Configure Evolution API to send messages to the webhook endpoint
Test Connection: Send a test audio message to verify the workflow

Workflow Nodes
Webhook: Receives WhatsApp messages from Evolution API
Edit Fields: Extracts relevant data (number, name, message, audio)
Switch: Filters only audio messages (audioMessage type)
Convert to File: Transforms base64 audio to MP3 format
HTTP Request: Sends audio to Groq API for transcription
Evolution API: Sends transcribed text back to WhatsApp
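
The Edit Fields step amounts to flattening the webhook payload into the handful of values the later nodes need. The sketch below shows one way to express that. The function name and the payload paths (`data.key.remoteJid`, `data.pushName`, `data.key.id`, `data.message.base64`) are assumptions for illustration and may differ from the expressions used in the workflow.

```python
from typing import Any, Dict, Mapping


def extract_fields(event: Mapping[str, Any]) -> Dict[str, str]:
    """Flatten an (assumed) Evolution API webhook payload into named fields."""
    data = event.get("data", {})
    key = data.get("key", {})
    message = data.get("message", {})
    remote_jid = key.get("remoteJid", "")
    return {
        # "5511999999999@s.whatsapp.net" -> "5511999999999"
        "number": remote_jid.split("@", 1)[0],
        "name": data.get("pushName", ""),
        "message_id": key.get("id", ""),
        "message_type": data.get("messageType", ""),
        "audio_base64": message.get("base64", ""),
    }
```

Missing keys fall back to empty strings, so a non-audio message produces an empty `audio_base64` rather than an exception.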

Configuration Options
Groq API Settings
Model: whisper-large-v3
Language: pt (Portuguese)
Temperature: 0 (most deterministic output)
Response Format: json
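
For readers who want to reproduce the HTTP Request node outside n8n, the settings above map onto a multipart form upload. The following is a rough standard-library sketch, assuming Groq's OpenAI-compatible transcription endpoint. The function name, the `audio.mp3` filename, and the `audio/mpeg` content type are illustrative choices, not values taken from the workflow.

```python
import uuid
from typing import Dict, Tuple

# Groq's OpenAI-compatible transcription endpoint.
GROQ_TRANSCRIPTION_URL = "https://api.groq.com/openai/v1/audio/transcriptions"


def build_transcription_request(
    audio: bytes, api_key: str, language: str = "pt"
) -> Tuple[Dict[str, str], bytes]:
    """Build headers and a multipart/form-data body for one audio file."""
    boundary = uuid.uuid4().hex
    # Same settings as listed above; form fields are sent as strings.
    fields = {
        "model": "whisper-large-v3",
        "language": language,
        "temperature": "0",
        "response_format": "json",
    }
    chunks = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        chunks.append(part.encode("utf-8"))
    file_head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="file"; filename="audio.mp3"\r\n'
        "Content-Type: audio/mpeg\r\n\r\n"
    )
    chunks.append(file_head.encode("utf-8") + audio + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return headers, b"".join(chunks)
```

The returned headers and body can be passed to `urllib.request.Request(GROQ_TRANSCRIPTION_URL, data=body, headers=headers, method="POST")`. With the json response format, the transcription is expected in the `text` field of the response.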

Customization Options
Change language by modifying the language parameter
Keep temperature at 0 for the most consistent transcriptions, or raise it to allow more varied output
Modify response format for different output styles

Response Format
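Mensagem transcrita automaticamente.
[Transcribed text content]

The first line is fixed Portuguese text meaning "Message transcribed automatically."; the transcription follows on the second line.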
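
The reply can be assembled as a fixed header line followed by the transcription. The sketch below also shows how the original audio message might be quoted. The function name is hypothetical, and the payload field names (`number`, `text`, `quoted.key.id`) are assumptions about the Evolution API send-text body, so check them against your Evolution API version.

```python
from typing import Any, Dict

# Fixed header line; Portuguese for "Message transcribed automatically."
REPLY_HEADER = "Mensagem transcrita automaticamente."


def build_reply(number: str, transcription: str, quoted_message_id: str) -> Dict[str, Any]:
    """Build an (assumed) send-text payload that quotes the original audio."""
    text = f"{REPLY_HEADER}\n{transcription.strip()}"
    return {
        "number": number,
        "text": text,
        # Assumed shape for referencing the message being replied to.
        "quoted": {"key": {"id": quoted_message_id}},
    }
```

Stripping the transcription avoids a stray leading space, which speech-to-text responses sometimes include.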

Technical Specifications
Input: Base64 encoded audio from WhatsApp
Output: Plain text transcription
Processing Time: Typically 2-5 seconds per audio message
Supported Audio: MP3 format (converted from WhatsApp audio)
Language: Portuguese (configurable)
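
The input side of these specifications, base64 text turned into a binary audio file, can be illustrated in a few lines. This is a sketch under the assumption that the payload is plain base64 without a data-URI prefix, and the helper names are hypothetical. Note that it only decodes the bytes and writes them to disk. It does not re-encode the audio.

```python
import base64
import binascii
from pathlib import Path


def decode_audio(audio_base64: str) -> bytes:
    """Decode a base64 audio payload, rejecting malformed input."""
    try:
        return base64.b64decode(audio_base64, validate=True)
    except binascii.Error as exc:
        raise ValueError("audio payload is not valid base64") from exc


def save_as_mp3(audio_base64: str, path: Path) -> Path:
    """Write the decoded bytes to the given path and return it."""
    path.write_bytes(decode_audio(audio_base64))
    return path
```

Using `validate=True` makes a truncated or corrupted payload fail loudly here instead of producing a file the transcription API would reject later.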

Troubleshooting
No Response: Check Groq API key and webhook configuration
Poor Transcription: Ensure audio quality and check language settings
Error Messages: Monitor n8n execution logs for detailed error information

Version History
v0.0.1: Initial release with basic transcription functionality

Nodes Used (1)

HTTP Request
n8n-nodes-base.httpRequest