Audio transcription in supervisor route #19
@Luisotee, it's working great, amazing job! Just don't forget to add the packages you use. For example:
Which will automatically add to the
A second comment is that the route
…idem/earth-defenders-assistant into luisotee/audio-transcription
@luandro should be fine now
@Luisotee when testing on the docs page, "send empty value" works when set for message, but for some reason when setting empty value for message an error is thrown:
@luandro Let me explain the "Send empty value" checkbox: When checked, it will send an empty argument to the API (like
The error you encountered happened because you were sending both an empty string and no audio file simultaneously, leaving nothing for the API to process.
I've also fixed the dependency issue, so this PR is now ready to be merged.
@Luisotee, thanks for the explanation, but as you can see in the picture I'm not sending an empty message, it has a string: "hello". There is something that the API should process.
@luandro The error occurred because the audio was sent as an empty file rather than being omitted or replaced with a valid file. The correct usage is:
About the "Send empty value" checkbox: it's just a default Swagger UI feature. We don't actually use it in our use case, and I haven't found a way to remove it from the interface.
Audio Transcription Support for Supervisor Route
Overview
Add audio transcription capabilities to the supervisor route using Groq's Whisper V3 Turbo API. This centralizes audio processing in the AI API service, eliminating the need for individual transcription handling in client integrations (simulator, WhatsApp, Telegram).
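As a rough illustration of the centralized flow described above, the sketch below shows one way a route could decide which text to hand to the supervisor when a request may carry a message, an audio file, or both. It is not taken from this PR: `resolve_message`, `EmptyRequestError`, and the injected `transcribe` callable are hypothetical names. Giving audio precedence over the message and treating a zero-byte upload as "no audio" are assumptions, and may differ from what the actual route does.

```python
from typing import Callable, Optional


class EmptyRequestError(ValueError):
    """Raised when the request carries neither usable text nor usable audio."""


def resolve_message(
    message: Optional[str],
    audio: Optional[bytes],
    transcribe: Callable[[bytes], str],
) -> str:
    """Return the text the supervisor should process.

    Hypothetical helper, not the PR's code. Assumptions: a non-empty audio
    payload is transcribed and takes precedence over the text message, and
    a zero-byte upload is treated the same as no upload at all.
    """
    if audio:  # None and b"" are both falsy, so empty uploads are skipped
        return transcribe(audio).strip()
    if message and message.strip():
        return message.strip()
    raise EmptyRequestError("Provide a non-empty message or an audio file.")


if __name__ == "__main__":
    # Stand-in transcriber; a real one would call the speech-to-text service.
    fake_transcribe = lambda data: "transcribed text"
    print(resolve_message("hello", b"", fake_transcribe))
    print(resolve_message(None, b"\x00\x01", fake_transcribe))
```

In a real service the `transcribe` argument would wrap the call to the transcription API. Passing it in as a callable keeps the decision logic testable without network access or an API key.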
Technical Changes
Dependencies
- python-multipart for form data handling
- python-ffmpeg for audio conversion
- groq for Whisper API access
Configuration
Requires GROQ_API_KEY environment variable
Benefits