How to reduce Whisper hallucinations #198

pr-data-port · 2024-07-20T15:56:19Z

When I ran a 15-minute audio file (from an interview) through the whisper timestamped model "", the output randomly contained text completely unrelated to the audio that was played. Has anyone ever seen such an issue?

Jeronymous · 2024-07-22T08:33:23Z

Something is missing in this description.
Please clarify which model you are running, and include either the command or the Python code you ran.

pr-data-port · 2024-07-22T13:42:52Z

@Jeronymous this is the code I called:

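import json

import torch
import whisper_timestamped

# Use the first GPU when available, otherwise fall back to CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
model_id = "openai/whisper-large-v3"
model = whisper_timestamped.load_model(model_id, device=device)

# `audio` is the input recording; how it was loaded is not shown in this post.
result = whisper_timestamped.transcribe(
    model,
    audio,
    language="en",
    beam_size=5,
    best_of=5,
    temperature=0.4,
    no_speech_threshold=0.6,
)

print(json.dumps(result, indent=2, ensure_ascii=False))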

(Changing the temperature to temperature=(0.2, 0.4, 0.6, 0.8, 1.0) didn't give any improvement.)
The output I got was this:

{
  "text": " Hello everyone! Today I will show you how to make a super easy and easy to make Christmas tree Christmas tree is a very simple and very easy to make and very easy to make Christmas tree is a very simple and very easy to make Christmas tree is a very simple and very easy to make Christmas tree is a very simple and very easy to make Christmas tree is a very simple and very easy to make Christmas tree is a very simple and very easy to make Christmas tree is a very simple and very easy to make Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree is a very simple Christmas tree [...]

I'm skipping the segment- and word-level timestamps, as I'm not sure they are relevant here.
The audio is about something completely different, not even remotely related to Christmas :D

Jeronymous · 2024-07-22T14:11:08Z

OK, it seems the model is hallucinating completely.
I wonder why you don't use temperature 0.
Maybe try with "temperature=0"...

Otherwise, I would need the audio to investigate further.

Maybe the volume level is particularly low?

You can also try the option vad="auditok" or vad="silero" to try to avoid hallucinations, as well as "condition_on_previous_text=False".
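For reference, here is a minimal sketch of how the settings suggested above could be combined in one call. The helper names `build_transcribe_options` and `transcribe_file` are hypothetical and only for illustration. The model id is the one from the earlier post. `best_of` is left out because it only applies when sampling with a non-zero temperature. Whether these settings remove the hallucination depends on the audio.

```python
def build_transcribe_options(vad="auditok", condition_on_previous_text=False):
    """Collect the keyword arguments suggested in this thread (hypothetical helper)."""
    return {
        "language": "en",
        "temperature": 0.0,  # deterministic decoding, no sampling
        "beam_size": 5,      # beam search is used when temperature is 0
        "vad": vad,          # "auditok" or "silero": remove non-speech before decoding
        "condition_on_previous_text": condition_on_previous_text,
    }


def transcribe_file(path, model_id="openai/whisper-large-v3", device="cpu"):
    """Transcribe one file with the options above (hypothetical helper)."""
    # Imported lazily so the options helper works without the package installed.
    import whisper_timestamped

    model = whisper_timestamped.load_model(model_id, device=device)
    audio = whisper_timestamped.load_audio(path)
    return whisper_timestamped.transcribe(model, audio, **build_transcribe_options())
```

Calling `build_transcribe_options(vad="silero", condition_on_previous_text=True)` gives the variant that keeps conditioning on previous text, which may be worth comparing if disabling it hurts accuracy.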

pr-data-port · 2024-07-23T08:19:08Z

Thank you @Jeronymous. Changing the temperature and adding vad="auditok" did indeed help: the output now shows the correct text, but it still contains lots of duplications. I am sceptical about adding condition_on_previous_text=False, because when I tried it previously it helped remove duplicates but the error rate rose significantly.
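One way to locate the remaining duplications is to flag segments whose text compresses unusually well, which is the same idea behind Whisper's `compression_ratio_threshold` option (default 2.4). The sketch below assumes the result contains a "segments" list whose items have a "text" field. The function names are hypothetical, and it only detects looping segments; it does not fix them.

```python
import zlib


def compression_ratio(text):
    """Ratio of raw to zlib-compressed size; repetitive text scores high."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def flag_repetitive_segments(segments, threshold=2.4):
    """Return the segments whose text looks like a repetition loop."""
    return [seg for seg in segments if compression_ratio(seg["text"]) > threshold]


# Illustrative input only, not real transcription output.
example_segments = [
    {"text": "The audio is about something completely different."},
    {"text": "Christmas tree is a very simple " * 30},
]
suspicious = flag_repetitive_segments(example_segments)
```

Flagged segments could then be re-transcribed with different settings or reviewed manually, rather than disabling conditioning on previous text for the whole file.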

Jeronymous added the "help wanted" label (Extra attention is needed) on Jul 24, 2024

Jeronymous changed the title from "Audio" to "How to reduce Whisper hallucinations" on Jul 24, 2024

This comment was marked as abuse.
