
[Bug]: How to configure Litellm to use direct Google API instead of Vertex AI for Gemini API calls? #8137

Open
Arijit-jktech opened this issue Jan 31, 2025 · 0 comments
Labels: bug (Something isn't working)


Arijit-jktech commented Jan 31, 2025

What happened?

I am using LiteLLM for both the completion and embedding methods with Gemini models. However, when making API calls with the gemini/ model name, I get an error indicating that the request is being routed through Vertex AI instead of calling the Google Gemini API directly.

Code Snippet
Here is the code I am using to invoke the gemini model:

from litellm import completion


# "gemini/" prefix, which I expected to call the Gemini API directly
response_object = completion(
    model="gemini/gemini-1.5-flash",
    messages=[{"role": "user", "content": "Hi, how are you?"}],
)
print(response_object)
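
I see the same behaviour from the embedding method mentioned above. For completeness, here is a minimal sketch of the embedding call I am using (assuming gemini/text-embedding-004 is the model id LiteLLM exposes for Google's embedding model):

from litellm import embedding

# Assumption: the embedding call goes through the same provider routing as
# completion(), so it should also target the direct Gemini API for gemini/ models.
embedding_response = embedding(
    model="gemini/text-embedding-004",
    input=["Hi, how are you?"],
)
print(embedding_response)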

Issue/Clarification Needed

  1. Is LiteLLM designed to use Vertex AI by default when calling Gemini models?
  2. Is there a way to configure LiteLLM to call the Google Gemini API directly instead of Vertex AI? (See the sketch after this list for what I have tried.)
  3. If routing through Vertex AI is the intended behaviour, are there any settings or configurations to bypass it?
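
For reference, here is a minimal sketch of the configuration I would expect to hit the direct Gemini (AI Studio) endpoint, assuming GEMINI_API_KEY is the environment variable LiteLLM reads for the gemini/ prefix and that this prefix selects the AI Studio API rather than Vertex AI:

import os
from litellm import completion

# Assumption: the gemini/ prefix selects the Google AI Studio (direct) API,
# authenticated with GEMINI_API_KEY rather than Vertex AI / GCP credentials.
os.environ["GEMINI_API_KEY"] = "<your-google-ai-studio-key>"

response = completion(
    model="gemini/gemini-1.5-flash",
    messages=[{"role": "user", "content": "Hi, how are you?"}],
)
print(response)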

Because Vertex AI has significantly lower rate limits than the direct Gemini API, I want to use the direct Gemini call. The same request works as expected when I run it through the google.generativeai SDK directly, as in the snippet below.

import os
import google.generativeai as genai

# Direct Gemini API call via the google.generativeai SDK (no Vertex AI involved)
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Hi, how are you?")
print(response)


For reference: https://www.reddit.com/r/googlecloud/comments/1dr7at1/vertex_ai_api_vs_gemini_api/

Any insights on whether this is expected behaviour or a configuration issue would be appreciated. Thanks!

Relevant log output

litellm.exceptions.InternalServerError: litellm.InternalServerError: litellm.InternalServerError: VertexAIException - {
  "error": {
    "code": 503,
    "message": "The model is overloaded. Please try again later.",
    "status": "UNAVAILABLE"
  }
}

Are you a ML Ops Team?

Yes

What LiteLLM version are you on?

1.58.2

Twitter / LinkedIn details

No response

Arijit-jktech added the bug (Something isn't working) label on Jan 31, 2025