Issue changing default voice in Gemini2 live api #378

notnotrishi · 2024-12-20T01:58:52Z

Description of the bug:

I'm using the code in live_api_starter.py to test the Multimodal Live API and build from there. The basic code works. Following the documentation at https://ai.google.dev/api/multimodal-live#sessions, I am trying to change the default voice, but it does not work when I use the format given in the API documentation:

My code snippet:

MODEL = "models/gemini-2.0-flash-exp"

MODE = args.mode

client = genai.Client(http_options={"api_version": "v1alpha"})

CONFIG = {
    "generation_config": {
        "response_modalities": ["AUDIO"],
        "speech_config": {
            "voice_config": {
                "prebuilt_voice_config": {
                    "voice_name": "Charon"
                }
            }
        },
    }
}

Error message:

Traceback (most recent call last):
File "/Users/rishi/Projects/gemini_live/new.py", line 393, in <module>
asyncio.run(main.run())
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/Users/rishi/Projects/gemini_live/new.py", line 362, in run
async with (
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py", line 204, in __aenter__
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/google/genai/live.py", line 626, in connect
self._LiveSetup_to_mldev(model=transformed_model, config=config)
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/google/genai/live.py", line 459, in _LiveSetup_to_mldev
_GenerateContentConfig_to_mldev(
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/google/genai/models.py", line 894, in _GenerateContentConfig_to_mldev
t.t_speech_config(api_client, getv(from_object, ['speech_config'])),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/google/genai/_transformers.py", line 301, in t_speech_config
raise ValueError(f'Unsupported speechConfig type: {type(origin)}')
ValueError: Unsupported speechConfig type: <class 'dict'>

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response


ThashilNaidoo · 2024-12-20T16:12:26Z

I had the same issue. Try this instead. It worked for me.

"generation_config": { "response_modalities": ["AUDIO"], "speech_config": "Charon" }

notnotrishi · 2024-12-20T19:55:49Z

That works, thanks @ThashilNaidoo!

I also noticed the report is triaged as a bug, so hopefully they can fix the issue and/or clarify which usage is correct.

MarkDaoust · 2025-01-09T00:26:18Z

Hi, thanks for reporting this. I just looked into it and this is fixed in the latest release (0.4).

Both methods work now.
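
The two shapes discussed in this thread, the bare voice-name string and the nested dict from the API documentation, describe the same setting. A minimal sketch of that equivalence in plain Python follows. `normalize_speech_config` is a hypothetical helper written for illustration, not a function from the google-genai SDK, and the SDK's own conversion may differ.

```python
from typing import Union


def normalize_speech_config(value: Union[str, dict]) -> dict:
    """Return the nested speech_config dict for either shape (hypothetical helper)."""
    if isinstance(value, str):
        # Short form: a bare prebuilt voice name such as "Charon".
        return {"voice_config": {"prebuilt_voice_config": {"voice_name": value}}}
    if isinstance(value, dict):
        # Long form: already the nested structure from the API documentation.
        return value
    raise ValueError(f"Unsupported speechConfig type: {type(value)}")


short_form = normalize_speech_config("Charon")
long_form = normalize_speech_config(
    {"voice_config": {"prebuilt_voice_config": {"voice_name": "Charon"}}}
)
```

Under these assumptions, `short_form` and `long_form` compare equal, so either can be placed under `"speech_config"` in the config dict.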

brandonwheat · 2025-01-10T16:02:56Z

I tried both and neither is working for me:

Invalid JSON payload received. Unknown name "prebuilt_voice_config " at 'setup.generati; then sent 1007 (invalid frame payload data) Request trace id: 74133fd7e2c07425, Invalid JSON payload received. Unknown name "prebuilt_voice_config " at 'setup.generati

conn = await es.enter_async_context(
    connect(
        f'wss://{HOST}/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContent?key={API_KEY}'
    )
)
print('')

initial_request = {
    'setup': {
        'model': MODEL,
        'system_instruction': {
            "parts": [
                {
                    "text": SYSTEM_MESSAGE
                }
            ]
        },
        "tools": {'function_declarations': [pay_bill_tool, get_quote_tool]},
        "generation_config": {
            "response_modalities": ["AUDIO"],
            "speech_config": {
                "voice_config": {
                    "prebuilt_voice_config ": {
                        "voice_name": "Puck"
                    }
                }
            }
        }
    },
}
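
Note that the key in the request above is written as "prebuilt_voice_config " with a trailing space, and the server error quotes the unknown name with the same trailing space, so this may be the cause here rather than the SDK. A minimal sketch for catching such keys before the payload is sent; `find_padded_keys` is a hypothetical helper, not part of any SDK.

```python
def find_padded_keys(obj, path=""):
    """Yield dotted paths of dict keys that have leading or trailing whitespace."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            here = f"{path}.{key}" if path else str(key)
            if isinstance(key, str) and key != key.strip():
                yield here
            # Recurse into nested dicts and lists.
            yield from find_padded_keys(value, here)
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            yield from find_padded_keys(item, f"{path}[{index}]")


# Trimmed-down copy of the request above, keeping the key exactly as posted.
request = {
    "setup": {
        "generation_config": {
            "response_modalities": ["AUDIO"],
            "speech_config": {
                "voice_config": {
                    "prebuilt_voice_config ": {"voice_name": "Puck"}
                }
            },
        }
    }
}

padded = list(find_padded_keys(request))
```

Here `padded` holds one path, ending in the key with the trailing space; removing the space from the key makes the list empty.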

gmKeshari added the type:bug, status:triaged, and component:examples labels Dec 20, 2024

Giom-V assigned Giom-V, markmcd and MarkDaoust and unassigned Giom-V Dec 21, 2024

github-actions bot mentioned this issue Jan 1, 2025

Monthly issue metrics report markmcd/gemini-api-cookbook#10

Open

MarkDaoust closed this as completed Jan 9, 2025

MarkDaoust reopened this Jan 10, 2025
