Cannot use multiple response modalities #194

Closed
emigre459 opened this issue Jan 24, 2025 · 1 comment
Labels
priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

emigre459 commented Jan 24, 2025

I'm trying to get Gemini to return both audio and a text version of the same response, but requesting both AUDIO and TEXT makes the script unresponsive.

Environment details

  • Programming language: python
  • OS: OS X Sequoia 15.2
  • Language runtime version: 3.11.11
  • Package version: 0.5.0

Steps to reproduce

  1. Add "TEXT" as a modality to the list stored under the "response_modalities" key in the CONFIG dictionary of this starter script.
  2. Speak to the model (no response expected)
  3. Type a message to the model (raises an error; see the stack trace below)

I observe the following stack trace when performing the steps above:
+ Exception Group Traceback (most recent call last):
  |   File "live_api_starter_old.py", line 236, in run
  |     async with (
  |   File "/Users/<user>/.pyenv/versions/3.11.11/lib/python3.11/asyncio/taskgroups.py", line 145, in __aexit__
  |     raise me from None
  | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "live_api_starter_old.py", line 208, in receive_audio
    |     async for response in turn:
    |   File "/Users/<user>/Projects/roadtripAI/.venv/lib/python3.11/site-packages/google/genai/live.py", line 129, in receive
    |     while result := await self._receive():
    |                     ^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/<user>/Projects/roadtripAI/.venv/lib/python3.11/site-packages/google/genai/live.py", line 197, in _receive
    |     raw_response = await self._ws.recv(decode=False)
    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/<user>/Projects/roadtripAI/.venv/lib/python3.11/site-packages/websockets/asyncio/connection.py", line 313, in recv
    |     raise self.protocol.close_exc from self.recv_exc
    | websockets.exceptions.ConnectionClosedError: received 1007 (invalid frame payload data) Request trace id: e2e57544872855a0, [ORIGINAL ERROR] generic::invalid_argument: Error in program Instantiation for language; then sent 1007 (invalid frame payload data) Request trace id: e2e57544872855a0, [ORIGINAL ERROR] generic::invalid_argument: Error in program Instantiation for language
    +------------------------------------

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "live_api_starter_old.py", line 277, in <module>
    asyncio.run(main.run())
  File "/Users/<user>/.pyenv/versions/3.11.11/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/Users/<user>/.pyenv/versions/3.11.11/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<user>/.pyenv/versions/3.11.11/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "live_api_starter_old.py", line 262, in run
    self.audio_stream.close()
    ^^^^^^^^^^^^^^^^^
AttributeError: 'AudioLoop' object has no attribute 'audio_stream'
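
The second traceback is a follow-on failure: the cleanup code calls `self.audio_stream.close()`, but the attribute was apparently never assigned because the websocket closed first. A minimal sketch of guarding that cleanup, assuming the attribute is normally set only once the microphone stream opens (`FakeStream`, `open_stream`, and `close_stream` are hypothetical names for illustration, not part of the starter script):

```python
class FakeStream:
    """Hypothetical stand-in for the microphone stream object."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class AudioLoop:
    def __init__(self):
        # Define the attribute up front so cleanup never hits AttributeError.
        self.audio_stream = None

    def open_stream(self, stream):
        # In the real script this would happen when the microphone is opened.
        self.audio_stream = stream

    def close_stream(self):
        # Close only if a stream was actually opened.
        if self.audio_stream is not None:
            self.audio_stream.close()
            self.audio_stream = None


loop = AudioLoop()
loop.close_stream()  # safe even though no stream was ever opened
```

This does not address the 1007 error itself; it only keeps the original exception from being masked by a second one during shutdown.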
emigre459 added the priority: p2 and type: bug labels on Jan 24, 2025
sasha-gitg (Member) commented

The linked sample states:

While Gemini 2.0 Flash is in experimental preview mode, only one of AUDIO or
TEXT may be passed here.

https://github.com/google-gemini/cookbook/blob/main/gemini-2/live_api_starter.py#L82
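
As a rough sketch of what that restriction means for the CONFIG dictionary, the helper below builds a config with exactly one modality and fails fast if more than one is requested. The `generation_config` nesting and both function names are assumptions for illustration, not taken from the starter script or the SDK:

```python
# Sketch only: the "generation_config" nesting is an assumption about the
# starter script's CONFIG layout; adjust the lookup if yours differs.
def single_modality_config(modality: str) -> dict:
    """Return a CONFIG-style dict requesting exactly one response modality."""
    if modality not in ("AUDIO", "TEXT"):
        raise ValueError(f"unsupported modality: {modality!r}")
    return {"generation_config": {"response_modalities": [modality]}}


def check_modalities(config: dict) -> None:
    """Raise locally instead of waiting for the connection to be closed."""
    modalities = config.get("generation_config", {}).get("response_modalities", [])
    if len(modalities) != 1:
        raise ValueError(
            f"expected exactly one response modality, got {modalities!r}"
        )


CONFIG = single_modality_config("AUDIO")
check_modalities(CONFIG)  # passes: one modality requested
```

Calling `check_modalities` on a config listing both "AUDIO" and "TEXT" raises a `ValueError` before any connection is opened, which is easier to diagnose than the 1007 close reported above.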
