new-contrib: Audio Whisper API with Local Device Microphones #1271
My previous PR message, from before I updated it. It's mostly justification for each criterion.
Introduction: This contribution introduces a practical guide on using the Whisper API to transcribe audio from a device's microphone. The notebook includes steps to record audio, transcribe it using the Whisper API, and copy the transcription to the clipboard, providing an accessible and useful resource for AI builders.
Justification: 1. Relevance: 2. Usefulness: 3. Uniqueness: 4. Clarity: 5. Correctness: 6. Conciseness: 7. Completeness: 8. Grammar:
Gently bumping this up. Willing to revise and have learned a lot about dealing after submitting my hackathon entry for Gemini API ^_^
Criteria | Description | Score |
---|---|---|
Relevance | Is the content related to building with OpenAI technologies? Is it useful to others? | 4 |
Uniqueness | Does the content offer new insights or unique information compared to existing documentation? | 4 |
Clarity | Is the language easy to understand? Are things well-explained? Is the title clear? | 4 |
Correctness | Are the facts, code snippets, and examples correct and reliable? Does everything execute correctly? | 2 |
Conciseness | Is the content concise? Are all details necessary? Can it be made shorter? | 4 |
Completeness | Is the content thorough and detailed? Are there things that weren’t explained fully? | 4 |
Grammar | Are there grammatical or spelling errors present? | 4 |
Really solid contribution, thank you! Motivation is clear, steps are broken down well, and the sections make sense. Caught a few mistakes here and there (mostly to do with using the SDK the old way), but once you correct them you're all set to merge!
Changelog
Hi @ibigio. Heavily revised my article now that I'm a month wiser. :)
Updated Image:
Content Structure:
Code Improvements:
OpenAI API Updates:
Documentation:
Terminology:
Aesthetic Improvements:
Hope everything is well, @ibigio. Is there anything else you'd want me to modify for this PR? Also, hope you saw my SWE internship application too. 🤭
@gabor-openai @ericning-o @danielin-openai @ray-openai hello folks! Perhaps @ibigio is caught up in the busyness of the good work he's doing for OAI. Gently tagging you guys so you could give the green signal to publish this, should it be satisfactory. Thank you very much for your hard work!
Code seems to run now mostly free of errors, left a couple comments around code clarity and correctness.
Unless the reader speaks Filipino they can't test this part out – how about translating from a more common second language like Spanish?
Also, an indefinite record makes many notebooks crash – set a 5-10 second limit as well.
# Demo: Transcribe lengthy Filipino speech and translate into English with proper grammar and punctuation
result = transcribe_audio(
    debug=False,
    prompt="Filipino spoken. Proper grammar and punctuation. Skip fillers.",
    timed_recording=False,
    record_seconds=0,
    is_english=False,
)
print("\nTranscription/Translation:", result)
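The recording limit suggested above could be enforced at the point where audio is read. The following is a minimal sketch, not the notebook's actual recorder: `record_bounded` and its parameters are hypothetical names, and `read_chunk` is assumed to be any blocking callable that returns raw bytes for a given number of frames (a PyAudio stream's `read` method would fit that shape).

```python
def record_bounded(read_chunk, sample_rate=16000, chunk_size=1024, max_seconds=10):
    """Read audio chunk by chunk and stop once max_seconds has been captured.

    read_chunk: hypothetical callable taking a frame count and returning bytes.
    """
    # Number of whole chunks that fit in the time limit.
    max_chunks = int(sample_rate * max_seconds / chunk_size)
    frames = []
    for _ in range(max_chunks):
        frames.append(read_chunk(chunk_size))
    # The caller can write these bytes to a WAV file before transcription.
    return b"".join(frames)
```

Because the loop is bounded by a chunk count instead of waiting for a keypress, the cell always terminates, which avoids the notebook crashes mentioned in the comment.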
Combining transcribing and translating in this function is a bit weird, and it also drops the prompt param for translations. (The prompt should be in English for a translation and in the language of choice for a transcription.) I'd split this out into two clear helper functions for translate and transcribe.
def process_audio(file_name, is_english=True, prompt=""):
    with open(file_name, "rb") as audio_file:
        if is_english:
            response = client.audio.transcriptions.create(
                model="whisper-1", file=audio_file, prompt=prompt
            )
        else:
            response = client.audio.translations.create(
                model="whisper-1", file=audio_file
            )
    return response.text.strip()
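One way the suggested split might look is sketched below. The function names `transcribe` and `translate` are illustrative, and passing `client` in as an argument is an assumption made here so the helpers do not rely on a global; the two `create` calls are the same ones used in `process_audio`, with `prompt` now forwarded to the translation call too.

```python
def transcribe(client, file_name, prompt=""):
    """Transcribe speech in its original language.

    The prompt, if given, should be written in the spoken language.
    """
    with open(file_name, "rb") as audio_file:
        response = client.audio.transcriptions.create(
            model="whisper-1", file=audio_file, prompt=prompt
        )
    return response.text.strip()


def translate(client, file_name, prompt=""):
    """Translate speech into English.

    The prompt, if given, should be written in English.
    """
    with open(file_name, "rb") as audio_file:
        response = client.audio.translations.create(
            model="whisper-1", file=audio_file, prompt=prompt
        )
    return response.text.strip()
```

With the official SDK, `client` would be an `OpenAI()` instance, so a call might read `translate(client, "speech.wav", prompt="Hello, welcome to the meeting.")`; the file name here is only a placeholder.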
I don't think this is how we intend for the prompt parameter to be used; looking at our docs, it is more of an example(s) than an instruction.
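To illustrate the difference, here is a small sketch that builds a prompt from example text in place of an instruction. `build_example_prompt` is a hypothetical helper, and the sample sentence and vocabulary are made up for illustration; the underlying assumption is that the model tends to follow the style and spellings of the prompt text rather than obey commands in it.

```python
# Instruction-style prompt from the notebook under review, kept for contrast.
INSTRUCTION_STYLE_PROMPT = "Filipino spoken. Proper grammar and punctuation. Skip fillers."


def build_example_prompt(sample_sentences, vocabulary=()):
    """Join sample sentences and tricky terms into transcript-like text.

    sample_sentences: text written the way the output should look
    (punctuated, capitalized, no filler words).
    vocabulary: names or jargon whose spelling should be preserved.
    """
    parts = list(sample_sentences)
    if vocabulary:
        parts.append(", ".join(vocabulary))
    return " ".join(parts)


example_prompt = build_example_prompt(
    ["Hello, welcome to the meeting. Today we will review the agenda."],
    vocabulary=["Whisper", "OpenAI"],
)
```

The resulting string reads like a fragment of a clean transcript, which is the kind of text the prompt parameter appears designed to accept.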
Thanks @ibigio! Will get these fixed within an hour.
…://github.com/CarlKho-Minerva/openai-cookbook into carl-kho/Whisper_API-device_mic_transcription
Summary
This PR adds a new notebook that demonstrates how to use the Whisper API to transcribe text from your device's microphone. The notebook includes steps to record audio, transcribe it using the Whisper API, and copy the transcription to the clipboard. It aims to provide a practical guide for users who want to integrate speech-to-text functionality into their applications.
*This pull request was written by ChatGPT and reviewed by a human. The article, however, was written by a human.
Motivation
This tutorial was created because the functionality to transcribe speech to text from a microphone is not well-documented. I found the mic speech-to-text option in the ChatGPT apps (not websites) extremely helpful for day-to-day operations and wanted to save others from having to learn about different audio processing modules.
For new content
When contributing new content, read through our contribution guidelines, and mark the following action items as completed: