Inference Pytest Failed at text_chat_completion_with_tool_calling scenarios #1185
It would also be good to document the package dependencies. To be able to run the test, I had to install the following packages:
Your vLLM server does not have tool calling enabled.
Thanks for the quick response! What is the command to enable tool calling in vLLM through Docker? I didn't see it described in the Llama Stack documentation. The vLLM documentation here only shows the vLLM command-level arguments. I tried adding the same arguments to the docker command:
I got the following error:
I do have the jinja file at that location. What about other inference engines like TGI? Do we need to explicitly enable tool calling there as well? The PyTorch flow does not seem to need it.
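For reference, a minimal sketch of passing the tool-calling flags to vLLM when launching through Docker. The image tag, model name, port, and template path here are illustrative assumptions, not the poster's actual command. Note that a "file not found" error for the jinja template often means the path exists on the host but was not mounted into the container, hence the `-v` bind mount below.

```shell
# Sketch: enabling tool calling on a vLLM OpenAI-compatible server via Docker.
# Model, template filename, and paths are assumptions for illustration.
docker run --gpus all -p 8000:8000 \
  -v "$HOME/templates:/templates" \
  vllm/vllm-openai:latest \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --enable-auto-tool-choice \
  --tool-call-parser llama3_json \
  --chat-template /templates/tool_chat_template_llama3.1_json.jinja
```

Because the vLLM-level arguments go after the image name, they are forwarded to the server entrypoint unchanged; only host-side resources (like the template file) need the extra Docker mount.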
I tested and TGI does not have this issue. There is some information like:
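To check whether a given backend (vLLM, TGI, ...) has tool calling enabled before running the full pytest suite, one can probe the OpenAI-compatible chat completions endpoint directly. This is a sketch using only the standard library; the base URL, model name, and `get_weather` tool are assumptions for illustration, and the probe itself requires a running server.

```python
# Sketch: probing an OpenAI-compatible /v1/chat/completions endpoint for
# tool-calling support. Base URL, model, and tool schema are assumptions.
import json
import urllib.request


def build_weather_tool() -> dict:
    """A minimal function-tool schema in the OpenAI tools format."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }


def probe_tool_calling(base_url: str, model: str):
    """Send one tool-equipped request; return the tool_calls field (or None)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
        "tools": [build_weather_tool()],
        "tool_choice": "auto",
    }
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # A server with tool calling enabled should return tool_calls
    # instead of (or alongside) plain text content.
    return body["choices"][0]["message"].get("tool_calls")
```

If the probe returns `None` (plain text only), the tool-calling pytest cases will fail for that backend regardless of the Llama Stack configuration.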
System Info
CUDA: 12.4 / Driver: 550.120 / 1xH100
Information
🐛 Describe the bug
Ran pytest against a host running vLLM and PGVector. All the tool-calling cases failed while the others passed. The error message includes an SQLite write error, but it may be unrelated, since the other cases passed despite the same error. I want to understand the root cause.
Command:
The following is the YAML file:
Error logs
Expected behavior
All test cases pass for text inference.