Cannot use multiple response modalities #194

emigre459 · 2025-01-24T21:41:00Z

I'm trying to get Gemini to return both audio and a text version of what was returned, but asking for both audio and text causes the code to become non-responsive.

Environment details

Programming language: python
OS: OS X Sequoia 15.2
Language runtime version: 3.11.11
Package version: 0.5.0

Steps to reproduce

1. Add "TEXT" as a modality to the list stored under the "response_modalities" key in the CONFIG dictionary of this starter script.
2. Speak to the model (no response expected).
3. Type a message to the model (generates an error, see stacktrace below).
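
For reference, the configuration change amounts to something like the sketch below. The nesting of CONFIG shown here is an assumption (the starter script may lay it out differently), and `with_extra_modality` is a hypothetical helper used only for illustration; only the "response_modalities" key and the "AUDIO"/"TEXT" values come from this report.

```python
import copy

# Assumed layout of the starter script's CONFIG dictionary (hypothetical;
# only the "response_modalities" key is taken from this report).
CONFIG = {"generation_config": {"response_modalities": ["AUDIO"]}}


def with_extra_modality(config: dict, modality: str) -> dict:
    """Return a copy of config with one more response modality appended."""
    updated = copy.deepcopy(config)
    updated["generation_config"]["response_modalities"].append(modality)
    return updated


# The two-modality configuration that triggers the failure described here.
CONFIG_BOTH = with_extra_modality(CONFIG, "TEXT")
```

Passing a configuration like `CONFIG_BOTH` when opening the live session is what leads to the behavior reported here.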
I observe the following stacktrace when attempting to perform the "Steps to reproduce":

+ Exception Group Traceback (most recent call last):
  |   File "live_api_starter_old.py", line 236, in run
  |     async with (
  |   File "/Users/<user>/.pyenv/versions/3.11.11/lib/python3.11/asyncio/taskgroups.py", line 145, in __aexit__
  |     raise me from None
  | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "live_api_starter_old.py", line 208, in receive_audio
    |     async for response in turn:
    |   File "/Users/<user>/Projects/roadtripAI/.venv/lib/python3.11/site-packages/google/genai/live.py", line 129, in receive
    |     while result := await self._receive():
    |                     ^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/<user>/Projects/roadtripAI/.venv/lib/python3.11/site-packages/google/genai/live.py", line 197, in _receive
    |     raw_response = await self._ws.recv(decode=False)
    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/<user>/Projects/roadtripAI/.venv/lib/python3.11/site-packages/websockets/asyncio/connection.py", line 313, in recv
    |     raise self.protocol.close_exc from self.recv_exc
    | websockets.exceptions.ConnectionClosedError: received 1007 (invalid frame payload data) Request trace id: e2e57544872855a0, [ORIGINAL ERROR] generic::invalid_argument: Error in program Instantiation for language; then sent 1007 (invalid frame payload data) Request trace id: e2e57544872855a0, [ORIGINAL ERROR] generic::invalid_argument: Error in program Instantiation for language
    +------------------------------------

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "live_api_starter_old.py", line 277, in <module>
    asyncio.run(main.run())
  File "/Users/<user>/.pyenv/versions/3.11.11/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/Users/<user>/.pyenv/versions/3.11.11/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<user>/.pyenv/versions/3.11.11/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "live_api_starter_old.py", line 262, in run
    self.audio_stream.close()
    ^^^^^^^^^^^^^^^^^
AttributeError: 'AudioLoop' object has no attribute 'audio_stream'
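
The trailing AttributeError looks like a secondary failure: the cleanup code at line 262 calls `self.audio_stream.close()`, but the attribute was apparently never assigned before the connection was closed, so the cleanup error is layered on top of the original 1007 error. A minimal sketch of a guard follows, assuming the attribute is normally set once the microphone stream opens. The `AudioLoop` class below is a simplified stand-in and `close_audio` is a hypothetical helper, neither taken from the starter script.

```python
class AudioLoop:
    """Simplified stand-in for the script's class; only cleanup is modeled."""

    def __init__(self):
        # Define the attribute up front so cleanup can always reference it,
        # even if the stream was never opened.
        self.audio_stream = None

    def close_audio(self) -> bool:
        """Close the audio stream if one was opened.

        Returns True if a stream was closed, False if there was nothing to close.
        """
        stream = self.audio_stream
        if stream is None:
            return False
        stream.close()
        self.audio_stream = None
        return True
```

With a guard like this, a connection failure would surface only the ConnectionClosedError rather than a second, unrelated-looking exception.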


sasha-gitg · 2025-02-03T22:34:05Z

The linked sample states:

While Gemini 2.0 Flash is in experimental preview mode, only one of AUDIO or
TEXT may be passed here.

https://github.com/google-gemini/cookbook/blob/main/gemini-2/live_api_starter.py#L82
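
Given that restriction, one option is to check the configuration before connecting so that the problem shows up as a clear local error instead of a 1007 websocket close. The sketch below is an illustration only: `validate_response_modalities` and `ALLOWED_MODALITIES` are hypothetical names, not part of the google-genai package, and the single-modality rule is the one quoted above for the experimental preview.

```python
# Modalities named in the quoted sample comment.
ALLOWED_MODALITIES = {"AUDIO", "TEXT"}


def validate_response_modalities(modalities):
    """Return the single requested modality, or raise ValueError.

    Enforces the preview-mode rule quoted above: exactly one of
    AUDIO or TEXT may be requested.
    """
    if len(modalities) != 1:
        raise ValueError(
            f"expected exactly one response modality, got {list(modalities)}"
        )
    modality = modalities[0]
    if modality not in ALLOWED_MODALITIES:
        raise ValueError(f"unsupported response modality: {modality!r}")
    return modality
```

Calling `validate_response_modalities(["AUDIO", "TEXT"])` raises a ValueError, while `validate_response_modalities(["AUDIO"])` returns `"AUDIO"`.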

emigre459 added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Jan 24, 2025

sasha-gitg closed this as completed Feb 3, 2025
