Transcribe voice messages and classify intent with OpenAI Whisper and GPT-4o-mini
Go to WorkflowDescription
Quick overview
This workflow downloads an audio file, transcribes it with OpenAI Whisper, classifies the transcript intent using OpenAI GPT-4o-mini, and returns a simple response message based on the detected category.
How it works
Runs when you manually execute the workflow.
Sets a sample audio URL (JFK .flac) and downloads the audio file via an HTTP request.
Sends the audio file to OpenAI Whisper to generate a text transcription.
Passes the transcript to OpenAI GPT-4o-mini to classify it as GREETING, QUESTION, REQUEST, or OTHER.
Normalizes the model output to an uppercase intent value and routes execution based on the intent.
Returns a predefined response message for the matched intent branch.
Setup
Add OpenAI API credentials for both the Whisper transcription step and the GPT-4o-mini intent classification step.
Replace the sample audio URL with your own audio source, or swap the manual trigger for a webhook that provides an audio URL.
If you use a different audio format, ensure the downloaded file is a supported type for OpenAI transcription (and adjust the MIME type/value if you rely on it elsewhere).
Customization
Connect to any WhatsApp gateway — Evolution API, Twilio, or WhatsApp Cloud API
Add custom intent categories to match your business (COMPLAINT, APPOINTMENT, PRICING)
Route each intent to a different workflow — CRM update, human escalation, auto-reply
Swap GPT-4o-mini for Claude Haiku to reduce costs on high-volume deployments
Extend with RAG to give context-aware responses based on your knowledge base
Additional info
This workflow is a simplified extract from a production multi-tenant
WhatsApp AI system handling real customer conversations.
Built with: n8n · OpenAI Whisper · GPT-4o-mini · Evolution API · Docker · Oracle Cloud
Tags: whatsapp, voice, audio, transcription, whisper, intent, classification,
chatbot, ai-agent, automation, openai, gpt4o-mini, customer-support, nlp