
fix: Get builtin tool calling working in remote-vllm #1236

Open · wants to merge 1 commit into base: main

Conversation

bbrowning (Contributor)

What does this PR do?

This PR makes a couple of changes required to get the test `tests/client-sdk/agents/test_agents.py::test_builtin_tool_web_search` passing on the remote-vllm provider.

First, we adjust `agent_instance` to also pass in the description and parameters of builtin tools. We need these so we can pass each tool's expected parameters to vLLM. The meta-reference implementations may not have needed them for builtin tools, since they can take advantage of the Llama-model-specific support for certain builtin tools. With vLLM, however, our server-side chat templates for tool calling treat all tools the same and don't distinguish Llama builtin tools from custom tools, so we need to pass the full set of parameter definitions and the list of required parameters for builtin tools as well.
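
To illustrate the shape of this change, here is a minimal sketch. It uses simplified stand-in dataclasses rather than the real llama_stack `ToolDefinition`/`ToolParamDefinition` types (whose import paths and exact fields may differ), and the `query` parameter schema is hypothetical, not the exact upstream one:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Simplified stand-ins for llama_stack's ToolParamDefinition / ToolDefinition;
# the real types live in the llama_stack datatypes and may carry more fields.
@dataclass
class ToolParamDefinition:
    param_type: str
    description: str
    required: bool = True

@dataclass
class ToolDefinition:
    tool_name: str
    description: Optional[str] = None
    parameters: Dict[str, ToolParamDefinition] = field(default_factory=dict)

# Before: builtin tools were registered by name only, so the server-side chat
# template had no schema to render for them.
web_search_before = ToolDefinition(tool_name="brave_search")

# After: also pass the description and expected parameters, so vLLM's
# tool-calling template can treat a builtin tool like any custom tool.
# (The "query" parameter here is illustrative only.)
web_search_after = ToolDefinition(
    tool_name="brave_search",
    description="Search the web for current information.",
    parameters={
        "query": ToolParamDefinition(
            param_type="string",
            description="The search query to run",
            required=True,
        ),
    },
)

print(web_search_after)
```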

Next, we adjust the vLLM streaming chat completion code to fix some edge cases where it returned an extra ChatCompletionResponseEvent containing an empty ToolCall whose call_id, tool_name, and arguments were all empty strings. This bug surfaced after the above fix: following a successful tool invocation, we were sending extra chunks back to the client with these empty ToolCalls.
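
To make the second fix concrete, here is a hedged sketch of the kind of guard involved, again using simplified stand-in types rather than the provider's actual ToolCall and event classes:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Simplified stand-in for the ToolCall delta carried by a streamed chunk.
@dataclass
class ToolCall:
    call_id: str = ""
    tool_name: str = ""
    arguments: str = ""

def is_empty_tool_call(tc: ToolCall) -> bool:
    """True when a streamed delta carries no actual tool-call content."""
    return not (tc.call_id or tc.tool_name or tc.arguments)

def forward_tool_calls(deltas: Iterable[ToolCall]) -> Iterator[ToolCall]:
    """Yield only meaningful tool-call deltas.

    This mirrors the idea of the fix: after a successful tool invocation the
    stream could still produce chunks whose ToolCall fields were all empty
    strings, and those should not become extra ChatCompletionResponseEvents
    sent to the client.
    """
    for tc in deltas:
        if is_empty_tool_call(tc):
            continue  # drop the spurious empty chunk instead of emitting an event
        yield tc

# Example: only the populated delta is forwarded; the trailing empty one is dropped.
deltas = [ToolCall("call_1", "brave_search", '{"query": "llama stack"}'), ToolCall()]
print(list(forward_tool_calls(deltas)))
```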

Test Plan

With these changes, the following test that previously failed now passes:

```
VLLM_URL="http://localhost:8000/v1" \
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
LLAMA_STACK_CONFIG=remote-vllm \
python -m pytest -v \
tests/client-sdk/agents/test_agents.py::test_builtin_tool_web_search \
--inference-model "meta-llama/Llama-3.2-3B-Instruct"
```

Additionally, I ran the remote-vllm client-sdk and provider inference tests as below to ensure they all still passed with this change:

```
VLLM_URL="http://localhost:8000/v1" \
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
LLAMA_STACK_CONFIG=remote-vllm \
python -m pytest -v \
tests/client-sdk/inference/test_text_inference.py \
--inference-model "meta-llama/Llama-3.2-3B-Instruct"
```

```
VLLM_URL="http://localhost:8000/v1" \
python -m pytest -s -v \
llama_stack/providers/tests/inference/test_text_inference.py \
--providers "inference=vllm_remote"
```

Signed-off-by: Ben Browning <[email protected]>
facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Feb 24, 2025
```
@@ -905,7 +905,19 @@ async def _get_tool_defs(
     if tool_def_map.get(built_in_type, None):
         raise ValueError(f"Tool {built_in_type} already exists")

-    tool_def_map[built_in_type] = ToolDefinition(tool_name=built_in_type)
+    tool_def_map[built_in_type] = ToolDefinition(
```

Collaborator:

@ashwinb @yanxi0830 Could you help double-check this change?

Contributor:

I'm not sure whether having additional fields here would break things (it seems not?), but if it doesn't, it looks like we can just set the key value for tool_def_map here, and then L926 below should take care of this.
