Audio less than 1s long silently fails all transcription #1603

isaac-mcfadyen · 2023-12-07T16:28:39Z

Issue

Apologies if this is a known issue! I looked and couldn't find an existing one 😄

When passing audio that is less than 1s long to whisper.cpp, regardless of the model or hardware acceleration (tested with small and medium, on both CPU only and cuBLAS), Whisper silently fails to transcribe the audio.

No error is returned. It looks like all processing is skipped (0ms is listed for the sample, encode, decode, and prompt times) and the transcript comes back empty.

Reproduction

First, make sure whisper is working by running it on the JFK sample:

    make clean
    make -j    # with or without cuBLAS
    ./main --model ./models/ggml-medium.en.bin bindings/go/samples/jfk.wav
- Working! Transcribes correctly.

Now use an external audio editor or ffmpeg to trim the audio to less than a second and run again:

    ffmpeg -i bindings/go/samples/jfk.wav -to 00:00:00.8 jfk-short.wav
    ./main --model models/ggml-medium.en.bin ./jfk-short.wav
- Not working. Mel shows some time (a few ms), but the encoder, decoder, prompt, and batchd times all show 0ms, and there's no transcript. I tested multiple clips to make sure it wasn't just this one, and I can reproduce it consistently.

If you pad the audio to more than a second, transcripts appear again:

    ffmpeg -i jfk-short.wav -af "apad=pad_dur=1" jfk-padded.wav
    ./main --model models/ggml-medium.en.bin ./jfk-padded.wav
- Working again! Shows the (cut off) transcript.
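
For callers that pass samples to whisper.cpp from their own code rather than going through ffmpeg, the same padding workaround can be sketched in C++. This is only an illustration under stated assumptions: the audio is 16 kHz mono float PCM, and the 1.1 s target used in the checks is an arbitrary value picked to land clearly above one second. `padding_needed` and `pad_with_silence` are hypothetical helpers, not part of the whisper.cpp API.

```cpp
#include <cstddef>
#include <vector>

// Assumption: the input is 16 kHz mono float PCM.
constexpr std::size_t kSampleRate = 16000;

// Hypothetical helper: number of zero samples to append so that the clip
// lasts at least min_ms milliseconds. Returns 0 if it is already long enough.
constexpr std::size_t padding_needed(std::size_t n_samples, std::size_t min_ms) {
    return n_samples >= kSampleRate * min_ms / 1000
               ? 0
               : kSampleRate * min_ms / 1000 - n_samples;
}

// Hypothetical helper: appends trailing silence (zeros) in place.
// Clips that already meet the minimum duration are left untouched.
inline void pad_with_silence(std::vector<float>& pcm, std::size_t min_ms) {
    pcm.resize(pcm.size() + padding_needed(pcm.size(), min_ms), 0.0f);
}

// A 0.8 s clip (12800 samples) needs 4800 zeros to reach 1.1 s (17600 samples).
static_assert(padding_needed(12800, 1100) == 4800, "0.8 s clip is padded up to 1.1 s");
// A 2 s clip (32000 samples) needs no padding.
static_assert(padding_needed(32000, 1100) == 0, "long clips are not padded");
```

Calling `pad_with_silence(pcm, 1100)` before transcription plays the same role as the `apad` step above, except that it pads up to a minimum length instead of always appending a full second.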

System Info

Tested on Ubuntu 22.04, amd64, both with and without cuBLAS enabled.


bobqianic · 2023-12-07T17:07:34Z

whisper.cpp/whisper.cpp

Lines 5199 to 5202 in 3163090

    
    // if only 1 second left, then stop
    if (seek + 100 >= seek_end) {
        break;
    }
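
To make the arithmetic behind this check concrete, here is a minimal sketch. It is not the whisper.cpp implementation. It assumes that `seek` and `seek_end` count 10 ms frames (so 100 frames is 1 s), that decoding starts at `seek = 0`, and that `seek_end` is roughly the clip length in frames. The names `frames_for_ms` and `stops_before_first_decode` are hypothetical.

```cpp
#include <cstdint>

// Assumption: seek positions are counted in 10 ms frames, so 100 frames == 1 s.
constexpr std::int64_t kFrameMs   = 10;
constexpr std::int64_t kMinFrames = 100;

// Hypothetical helper: approximate frame count for a clip of the given length.
constexpr std::int64_t frames_for_ms(std::int64_t duration_ms) {
    return duration_ms / kFrameMs;
}

// Mirrors the quoted condition `seek + 100 >= seek_end`: true means the loop
// would break before decoding anything from the current position.
constexpr bool stops_before_first_decode(std::int64_t seek, std::int64_t seek_end) {
    return seek + kMinFrames >= seek_end;
}

// 0.8 s clip: 0 + 100 >= 80, so the loop exits before the first decode.
static_assert(stops_before_first_decode(0, frames_for_ms(800)), "0.8 s clip is skipped");
// Exactly 1 s: 0 + 100 >= 100, still skipped because the comparison is >=.
static_assert(stops_before_first_decode(0, frames_for_ms(1000)), "1.0 s clip is skipped");
// 0.8 s clip plus 1 s of padding: 0 + 100 < 180, so decoding proceeds.
static_assert(!stops_before_first_decode(0, frames_for_ms(1800)), "padded clip is decoded");
```

Under these assumptions, any clip of one second or less trips the check on the very first iteration, which would match the 0ms encode and decode times reported above.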

jerzygangi · 2024-09-21T19:04:11Z

@bobqianic Are you aware if this bug was ever fixed?

bobqianic added the "question" label (Further information is requested) · Dec 10, 2023

bobqianic mentioned this issue in "Server example? #1369" (Open) · Dec 10, 2023

bobqianic added the "enhancement" label (New feature or request) · Jan 15, 2024

bobqianic linked pull request "Fix the decoding issues #1768" (Open), which would close this issue · Jan 15, 2024

bobqianic removed the link to pull request "Fix the decoding issues #1768" (Open) · Feb 5, 2024
