Audio Timestamp option missing in Generation Config #4511

Waheguru-Anurag · 2024-10-05T10:04:21Z

Hi I was reading the vertex ai documentation - [https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/audio-understanding#:~:text=files%2C%20enable%20the-,audioTimestamp,-parameter%20in%20GenerationConfig]

Here it is mentioned:
2. Audio-only timestamps: To accurately generate timestamps for audio-only files, you must configure the audio_timestamp parameter in generation_config.

But I am not able to set this parameter in generation_config.

jaycee-li · 2024-10-07T23:10:11Z

Hi @Waheguru-Anurag, this field was recently added (last week) and is currently only available in the REST API. The Python SDK hasn't been updated to support it yet.

tfriedel · 2024-10-08T22:31:21Z

I tried this parameter using the REST API but didn't notice an improvement.
For a 3 min long file timestamps suggested it was over 4 minutes long.

I used this prompt:

Translate the audio to english. Include timestamps and speakers. Use the following format:

<example>
[00:17] Agent (male): Yes, sir. So, you have a shop that sells medicines, fertilizers, and seeds?
[00:19] Customer (male): Hmm.
[00:21] Agent (male): Sir, I have this app, sir, for retailers.
</example>

nithinsag · 2024-10-14T17:44:34Z

I have tried the rest api as well, didn't help as well.

product-auto-label bot added the api: vertex-ai Issues related to the googleapis/python-aiplatform API. label Oct 5, 2024

jaycee-li self-assigned this Oct 7, 2024

APPXOTICA mentioned this issue Oct 20, 2024

Gemini 1.5 Flash 002 Hallucinates Timestamps when transcribing audio google-gemini/generative-ai-js#269

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Audio Timestamp option missing in Generation Config #4511

Audio Timestamp option missing in Generation Config #4511

Waheguru-Anurag commented Oct 5, 2024

jaycee-li commented Oct 7, 2024

tfriedel commented Oct 8, 2024

nithinsag commented Oct 14, 2024

Audio Timestamp option missing in Generation Config #4511

Audio Timestamp option missing in Generation Config #4511

Comments

Waheguru-Anurag commented Oct 5, 2024

jaycee-li commented Oct 7, 2024

tfriedel commented Oct 8, 2024

nithinsag commented Oct 14, 2024