Added chat & completions api #10

Open · wants to merge 13 commits into master

Conversation

ksriv001

No description provided.

ksriv001 self-assigned this Jan 22, 2025
ksriv001 changed the title from "Added chat api" to "Added chat & completions api" Jan 22, 2025
ksriv001 requested a review from rreece January 23, 2025 19:20
"stream": stream,
}

base_url = "https://vllm-tt-dev-49305ac9.workload.tenstorrent.com/v1"
ksriv001 (Author)

We should create an ingress route that would be the standard for the EdenAI vLLM deployment. The workload I tested is currently in vllm-tt-dev under the tenstorrent team, but I'll move it to the EdenAI team.

rreece (Member)

yes

ksriv001 (Author)

I think we might need to add some taints to the LLMBoxes in k8s that we specifically want to reserve for Eden?
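
For illustration only (the node name and the label key/value are assumptions, nothing decided in this thread), reserving an LLMBox node with a taint and a matching toleration on the vLLM deployment could look like:

kubectl taint nodes <llmbox-node> dedicated=edenai:NoSchedule

# and in the vLLM pod spec, so only tolerating pods get scheduled onto the tainted node:
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "edenai"
    effect: "NoSchedule"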

"max_tokens": max_tokens,
}

base_url = "https://vllm-tt-dev-49305ac9.workload.tenstorrent.com/v1"
ksriv001 (Author)

We should create an ingress route that would be the standard for the EdenAI vLLM deployment. The workload I tested is currently in vllm-tt-dev under the tenstorrent team, but I'll move it to the EdenAI team.

rreece (Member) left a comment

Great progress. Here are some thoughts.

"stream": stream,
}

base_url = "https://vllm-tt-dev-49305ac9.workload.tenstorrent.com/v1"
rreece (Member)

yes

edenai_apis/apis/tenstorrent/tenstorrent_text_api.py (outdated, resolved)
}

base_url = "https://vllm-tt-dev-49305ac9.workload.tenstorrent.com/v1"
client = OpenAI(base_url=base_url,api_key=self.api_key)
rreece (Member)

I like that you are using the openai client. I'm not sure the client should be redefined on every call like this.

Remember that defining the client initializes a session. If we want sticky sessions to work, the same client needs to be re-used for the same user.

Note that the example edenai-apis OpenAI implementation you shared earlier also uses the client, and the client is saved to the class as a member of self. I'm actually confused about where the client is initialized:

https://github.com/edenai/edenai-apis/blob/aa32409084469b81be98989964b0ab82d7326ccc/edenai_apis/apis/openai/openai_text_api.py#L691

Can you figure out where the client is initialized? We should probably do something similar.

Also, just noting for comparison that the openai implementation of text__generation does not use the client and just uses the requests module, but I think we should use the client:

https://github.com/edenai/edenai-apis/blob/aa32409084469b81be98989964b0ab82d7326ccc/edenai_apis/apis/openai/openai_text_api.py#L472

ksriv001 (Author) commented Jan 23, 2025

Ahh, great point! You are right that if we want to leverage sticky sessions/KV-cache reuse, then we probably shouldn't instantiate the client again and again.

After looking a bit at how OpenAI did it, the client for the text__chat method in the OpenaiTextApi class is initialized in the OpenaiApi class constructor here:
https://github.com/edenai/edenai-apis/blob/aa32409084469b81be98989964b0ab82d7326ccc/edenai_apis/apis/openai/openai_api.py#L55

This happens during the instantiation of the OpenaiApi object in openai_api.py. The OpenaiApi class inherits from OpenaiTextApi, so the self.client object will be accessible to text__chat.

I'll try to modify along similar lines.
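
A minimal sketch of what that could look like on the Tenstorrent side (the class names, module layout, and the text__chat signature are assumptions for illustration, not the final code):

from openai import OpenAI


class TenstorrentTextApi:
    # text__chat reuses the shared self.client created once by the provider constructor
    def text__chat(self, text: str, model: str, max_tokens: int, stream: bool = False):
        return self.client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": text}],
            max_tokens=max_tokens,
            stream=stream,
        )


class TenstorrentApi(TenstorrentTextApi):
    provider_name = "tenstorrent"

    def __init__(self, api_keys: dict | None = None):
        settings = api_keys or {}  # in practice loaded from tenstorrent_settings.json
        self.api_key = settings.get("api_key", "")
        # placeholder endpoint until the standard EdenAI ingress route exists
        base_url = "https://vllm-tt-dev-49305ac9.workload.tenstorrent.com/v1"
        # instantiate the client once so the same HTTP session (and sticky session) is reused across calls
        self.client = OpenAI(base_url=base_url, api_key=self.api_key)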

ksriv001 (Author)

Done.

ksriv001 (Author) commented Jan 23, 2025

For the other note you mentioned about openai using requests for text__generation rather than the OpenAI client, our tenstorrent code already uses the OpenAI client for both text__chat and text__generation.
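
Continuing the class sketch above (the method name follows the Edenai pattern; the parameters are illustrative, not the exact code in this PR), text__generation can go through the same shared client via the completions endpoint:

    def text__generation(self, text: str, model: str, max_tokens: int):
        # same shared self.client, completions endpoint instead of chat
        return self.client.completions.create(
            model=model,
            prompt=text,
            max_tokens=max_tokens,
        )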

ksriv001 (Author)

@rreece One thing I am confused about is whether the name of this file should be set to tenstorrent_settings.json rather than tenstorrent_template_settings.json.
I am asking this since the instructions here mention that the file name should follow the pattern provider_settings.json, and in fact this is what I have used while testing, manually hardcoding the value of the api_key key in the JSON.
Do you know what it should be?

rreece (Member) commented Jan 23, 2025

Similar to the examples from other providers, the name of the file saved to the repo should be tenstorrent_settings_template.json; it should then be copied to tenstorrent_settings.json and the actual key added during testing. But tenstorrent_settings.json should never be added to the repo. Only the template should be in the repo.
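
For reference, a minimal sketch of what the template could contain (only the api_key field is confirmed in this thread; any other fields would be assumptions):

tenstorrent_settings_template.json:

{
    "api_key": ""
}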

ksriv001 (Author)

Cool, sounds like I have been doing it the right way then. Thanks for confirming.
