fix(main.py): fix retries being multiplied when using openai sdk (#7221)
* fix(main.py): fix retries being multiplied when using openai sdk

  Closes #7130

* docs(prompt_management.md): add langfuse prompt management doc

* feat(team_endpoints.py): allow teams to add their own models

  Enables teams to call their own finetuned models via the proxy

* test: add better enforcement check testing for `/model/new` now that teams can add their own models

* docs(team_model_add.md): tutorial for allowing teams to add their own models

* test: fix test
1 parent 8060c5c · commit ec36353 · 16 changed files with 2,440 additions and 1,541 deletions.

@@ -0,0 +1,83 @@

import Image from '@theme/IdealImage';

# Prompt Management

LiteLLM supports using [Langfuse](https://langfuse.com/docs/prompts/get-started) for prompt management on the proxy.

## Quick Start

1. Add Langfuse as a 'callback' in your config.yaml

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/chatgpt-v-2
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE

litellm_settings:
  callbacks: ["langfuse"] # 👈 KEY CHANGE
```
2. Start the proxy

```bash
litellm --config config.yaml
```

3. Test it!

```bash
curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "user",
            "content": "THIS WILL BE IGNORED"
        }
    ],
    "metadata": {
        "langfuse_prompt_id": "value",
        "langfuse_prompt_variables": { # [OPTIONAL]
            "key": "value"
        }
    }
}'
```
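
If you call the proxy from the OpenAI Python SDK instead of curl, the same `metadata` fields can be passed via `extra_body`. A minimal sketch, using the placeholder base URL, API key, and prompt ID from the curl example above:

```python
# Minimal sketch: call the LiteLLM proxy with the OpenAI Python SDK.
# base_url, api_key, and the prompt ID are the placeholder values from above.
from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:4000",  # the LiteLLM proxy
    api_key="sk-1234",               # your proxy API key
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "THIS WILL BE IGNORED"}],
    # extra_body forwards proxy-specific fields like `metadata`
    extra_body={
        "metadata": {
            "langfuse_prompt_id": "value",
            "langfuse_prompt_variables": {"key": "value"},  # optional
        }
    },
)
print(response.choices[0].message.content)
```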

## What is 'langfuse_prompt_id'?

- `langfuse_prompt_id`: The ID of the prompt that will be used for the request.

<Image img={require('../../img/langfuse_prompt_id.png')} />
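
For reference, this is roughly what the proxy does with these two fields, sketched with the Langfuse Python SDK. The prompt name and variables below are placeholders, and LiteLLM's actual implementation may differ in its details:

```python
# Rough sketch: fetch and compile a prompt with the Langfuse Python SDK.
# The prompt name ("value") and variables are placeholders.
from langfuse import Langfuse

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY from the environment

prompt = langfuse.get_prompt("value")          # 👈 corresponds to `langfuse_prompt_id`
compiled_prompt = prompt.compile(key="value")  # 👈 corresponds to `langfuse_prompt_variables`
print(compiled_prompt)
```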

## What will the formatted prompt look like?

### `/chat/completions` messages

The Langfuse prompt will be added to the start of the `messages` list.

- If the Langfuse prompt is a list, it will be added to the start of the `messages` list (assuming it's an OpenAI-compatible message list).

- If the Langfuse prompt is a string, it will be added as a system message at the start of the list.

```python
if isinstance(compiled_prompt, list):
    data["messages"] = compiled_prompt + data["messages"]
else:
    data["messages"] = [
        {"role": "system", "content": compiled_prompt}
    ] + data["messages"]
```
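
To make the two branches concrete, here is a small illustrative example with made-up values:

```python
# Illustrative only: placeholder messages and compiled prompts.
data = {"messages": [{"role": "user", "content": "THIS WILL BE IGNORED"}]}

# Case 1: compiled Langfuse prompt is a string -> injected as a system message
compiled_prompt = "You are a helpful assistant. Answer in one sentence."
merged = [{"role": "system", "content": compiled_prompt}] + data["messages"]
# merged == [
#   {"role": "system", "content": "You are a helpful assistant. Answer in one sentence."},
#   {"role": "user", "content": "THIS WILL BE IGNORED"},
# ]

# Case 2: compiled Langfuse prompt is a list of OpenAI-style messages -> prepended as-is
compiled_prompt = [{"role": "system", "content": "You are a movie critic."}]
merged = compiled_prompt + data["messages"]
```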

### `/completions` messages

The Langfuse prompt will be added to the start of the request's `prompt` string.

```python
data["prompt"] = compiled_prompt + "\n" + data["prompt"]
```

@@ -0,0 +1,77 @@

# Allow Teams to Add Models

Allow a team to add their own models/keys for their project - so any OpenAI call they make uses their own OpenAI key.

Useful for teams that want to call their own finetuned models.

## Specify Team ID in `/model/new` endpoint

```bash
curl -L -X POST 'http://0.0.0.0:4000/model/new' \
-H 'Authorization: Bearer sk-******2ql3-sm28WU0tTAmA' \ # 👈 Team API Key (has same 'team_id' as below)
-H 'Content-Type: application/json' \
-d '{
    "model_name": "my-team-model",       # 👈 Call LiteLLM with this model name
    "litellm_params": {
        "model": "openai/gpt-4o",
        "custom_llm_provider": "openai",
        "api_key": "******ccb07",
        "api_base": "https://my-endpoint-sweden-berri992.openai.azure.com",
        "api_version": "2023-12-01-preview"
    },
    "model_info": {
        "team_id": "e59e2671-a064-436a-a0fa-16ae96e5a0a1" # 👈 Specify the team ID it belongs to
    }
}'
```
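
The Team API Key above is assumed to be a key generated for the same `team_id`. If you don't have one yet, a minimal sketch of generating it via the proxy's `/key/generate` endpoint (assuming the proxy is on `http://0.0.0.0:4000` and `sk-1234` is a placeholder master key):

```python
# Sketch: generate a Team API Key for the team_id used above.
# The proxy URL and master key are placeholders; adapt to your setup.
import requests

resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234", "Content-Type": "application/json"},
    json={"team_id": "e59e2671-a064-436a-a0fa-16ae96e5a0a1"},
)
print(resp.json()["key"])  # 👈 use this as the Team API Key in the requests above and below
```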

## Test it!

```bash
curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-******2ql3-sm28WU0tTAmA' \ # 👈 Team API Key
-d '{
    "model": "my-team-model",   # 👈 team model name
    "messages": [
        {
            "role": "user",
            "content": "What is the weather like in Boston today?"
        }
    ]
}'
```
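
Since the proxy is OpenAI-compatible, the same call can be made from the OpenAI Python SDK. A minimal sketch using the placeholder team key and team model name from above:

```python
# Sketch: call the team-added model through the proxy with the OpenAI SDK.
# The API key and model name are the placeholder values used above.
from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:4000",
    api_key="sk-******2ql3-sm28WU0tTAmA",  # 👈 Team API Key
)

response = client.chat.completions.create(
    model="my-team-model",  # 👈 team model name
    messages=[{"role": "user", "content": "What is the weather like in Boston today?"}],
)
print(response.choices[0].message.content)
```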

## Debugging

### 'model_name' not found

Check if the model alias exists in the team table.

```bash
curl -L -X GET 'http://localhost:4000/team/info?team_id=e59e2671-a064-436a-a0fa-16ae96e5a0a1' \
-H 'Authorization: Bearer sk-******2ql3-sm28WU0tTAmA'
```

**Expected Response:**

```json
{
    "team_id": "e59e2671-a064-436a-a0fa-16ae96e5a0a1",
    "team_info": {
        ...,
        "litellm_model_table": {
            "model_aliases": {
                "my-team-model":  # 👈 public model name
                    "model_name_e59e2671-a064-436a-a0fa-16ae96e5a0a1_e81c9286-2195-4bd9-81e1-cf393788a1a0"  # 👈 internally generated model name (used to ensure uniqueness)
            },
            "created_by": "default_user_id",
            "updated_by": "default_user_id"
        }
    }
}
```