Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Unregister a model from registry if not being served #1179

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

terrytangyuan
Copy link
Collaborator

@terrytangyuan terrytangyuan commented Feb 20, 2025

What does this PR do?

This ensures that the model is deleted from the registry if it's not actively being served by vLLM remote server.

Test Plan

All tests in tests/client-sdk/inference/test_text_inference.py passed.

LLAMA_STACK_BASE_URL=http://localhost:5002 pytest -v tests/client-sdk/inference/test_text_inference.py
=============================================================== test session starts ===============================================================
platform linux -- Python 3.10.16, pytest-8.3.4, pluggy-1.5.0 -- /home/yutang/.conda/envs/distribution-myenv/bin/python3.10
cachedir: .pytest_cache
rootdir: /home/yutang/repos/llama-stack
configfile: pyproject.toml
plugins: anyio-4.8.0
collected 16 items                                                                                                                                

tests/client-sdk/inference/test_text_inference.py::test_text_completion_non_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED              [  6%]
tests/client-sdk/inference/test_text_inference.py::test_text_completion_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED                  [ 12%]
tests/client-sdk/inference/test_text_inference.py::test_completion_log_probs_non_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED         [ 18%]
tests/client-sdk/inference/test_text_inference.py::test_completion_log_probs_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED             [ 25%]
tests/client-sdk/inference/test_text_inference.py::test_text_completion_structured_output[meta-llama/Llama-3.1-8B-Instruct] PASSED          [ 31%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_non_streaming[meta-llama/Llama-3.1-8B-Instruct-Which planet do humans live on?-Earth] PASSED [ 37%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_non_streaming[meta-llama/Llama-3.1-8B-Instruct-Which planet has rings around it with a name starting with letter S?-Saturn] PASSED [ 43%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_streaming[meta-llama/Llama-3.1-8B-Instruct-What's the name of the Sun in latin?-Sol] PASSED [ 50%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_streaming[meta-llama/Llama-3.1-8B-Instruct-What is the name of the US captial?-Washington] PASSED [ 56%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_with_tool_calling_and_non_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED [ 62%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_with_tool_calling_and_streaming[meta-llama/Llama-3.1-8B-Instruct] PASSED [ 68%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_with_tool_choice_required[meta-llama/Llama-3.1-8B-Instruct] PASSED [ 75%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_with_tool_choice_none[meta-llama/Llama-3.1-8B-Instruct] PASSED [ 81%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_structured_output[meta-llama/Llama-3.1-8B-Instruct] PASSED     [ 87%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_tool_calling_tools_not_in_request[meta-llama/Llama-3.1-8B-Instruct-True] PASSED [ 93%]
tests/client-sdk/inference/test_text_inference.py::test_text_chat_completion_tool_calling_tools_not_in_request[meta-llama/Llama-3.1-8B-Instruct-False] PASSED [100%]

==================================================== 16 passed, 1 warning in 586.72s (0:09:46) ====================================================

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Feb 20, 2025
@terrytangyuan
Copy link
Collaborator Author

terrytangyuan commented Feb 21, 2025

@ashwinb I saw a need to do this in Ollama as well (noticed your commit 36162c8) since we want to undo the model registration if it's not available. WDYT?

@ashwinb
Copy link
Contributor

ashwinb commented Feb 21, 2025

@terrytangyuan yeah I think we need to make this work better -- we should not put the model into the registry at all until the provider acknowledges successfully that they have registered it. I am going to refactor ModelRegistryHelper slightly (it has become a convoluted mess (with so little code) with badly named and behaved methods)

@terrytangyuan
Copy link
Collaborator Author

@ashwinb I just applied the same to Ollama as well. I think this will at least correct some existing behavior before your refactoring.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants