v0.1.3 Release
Here are the key changes included in this release.
Build and Test Agents
Streamlined the initial development experience
- Added support for `llama stack run --image-type venv`
- Enhanced vector store options with new sqlite-vec provider and improved Qdrant integration
- vLLM improvements for tool calling and logprobs
- Better handling of sporadic `code_interpreter` tool calls
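To build intuition for the new sqlite-vec vector store option, here is a brute-force, illustrative sketch of what a SQLite-backed vector store does conceptually: store embeddings alongside text and rank rows by cosine similarity. The real provider uses the sqlite-vec extension for efficient search; every name below is hypothetical and for illustration only.

```python
import json
import math
import sqlite3

# In-memory SQLite table holding text chunks and their embeddings
# (stored as JSON for simplicity in this sketch).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, emb TEXT)")

def add(text, emb):
    db.execute("INSERT INTO chunks (text, emb) VALUES (?, ?)", (text, json.dumps(emb)))

def search(query_emb, k=1):
    # Brute-force cosine-similarity ranking over all stored rows.
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
        return dot / norm
    rows = db.execute("SELECT text, emb FROM chunks").fetchall()
    ranked = sorted(rows, key=lambda r: cos(query_emb, json.loads(r[1])), reverse=True)
    return [text for text, _ in ranked[:k]]

add("llama stack docs", [1.0, 0.0])
add("unrelated note", [0.0, 1.0])
print(search([0.9, 0.1]))  # ['llama stack docs']
```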
Agent Evals
Better benchmarking and agent performance assessment
- Renamed the eval API `/eval-task` to `/benchmarks`
- Improved documentation and notebooks for RAG and evals
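For clients still targeting the old eval route, the rename above amounts to a simple path substitution. The helper below is illustrative only, not part of the official client:

```python
# Illustrative migration helper for the /eval-task -> /benchmarks
# rename (hypothetical function, not an official API).
def migrate_route(path: str) -> str:
    return path.replace("/eval-task", "/benchmarks")

print(migrate_route("/v1/eval-task/list"))  # /v1/benchmarks/list
```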
Deployment and Monitoring of Agents
Improved production readiness
- Added usage metrics collection for chat completions
- CLI improvements for provider information
- Improved error handling and system reliability
- Better model endpoint handling and accessibility
- Improved signal handling on the distro server
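The signal-handling improvement is about shutting down cleanly when the server process receives SIGTERM or SIGINT. A minimal sketch of the pattern, with hypothetical names that are not the actual distro server code:

```python
import signal
import sys

# Illustrative graceful-shutdown sketch: on SIGTERM/SIGINT, do cleanup
# (e.g. close connections, flush telemetry) and exit with status 0
# instead of dying mid-request.
def handle_shutdown(signum, frame):
    # Cleanup work would go here before exiting.
    sys.exit(0)

signal.signal(signal.SIGTERM, handle_shutdown)
signal.signal(signal.SIGINT, handle_shutdown)
```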
Better Engineering
Infrastructure and code quality improvements
- Faster text-based chat completion tests
- Improved testing for non-streaming agent APIs
- Standardized import formatting with ruff linter
- Adopted the Conventional Commits standard
- Fixed documentation parsing issues
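The ruff import-formatting change means import order is now enforced by the linter. An assumed `pyproject.toml` fragment showing the general shape of such a setup (not the project's actual configuration):

```toml
# Illustrative ruff configuration enabling isort-style import sorting
# (rule group "I"); assumed, not copied from the repository.
[tool.ruff.lint]
select = ["I"]
```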
What's Changed
- Getting started notebook update by @jeffxtang in #936
- docs: update index.md for 0.1.2 by @raghotham in #1013
- test: Make text-based chat completion tests run 10x faster by @terrytangyuan in #1016
- chore: Updated requirements.txt by @cheesecake100201 in #1017
- test: Use JSON tool prompt format for remote::vllm provider by @terrytangyuan in #1019
- docs: Render check marks correctly on PyPI by @terrytangyuan in #1024
- docs: update rag.md example code to prevent errors by @MichaelClifford in #1009
- build: update uv lock to sync package versions by @leseb in #1026
- fix: Gaps in doc codegen by @ellistarn in #1035
- fix: Readthedocs cannot parse comments, resulting in docs bugs by @ellistarn in #1033
- fix: a bad newline in ollama docs by @ellistarn in #1036
- fix: Update Qdrant support post-refactor by @jwm4 in #1022
- test: replace blocked image URLs with GitHub-hosted by @leseb in #1025
- fix: Added missing `tool_config` arg in SambaNova `chat_completion()` by @terrytangyuan in #1042
- docs: Updating wording and nits in the README.md by @kelbrown20 in #992
- docs: remove changelog mention from PR template by @leseb in #1049
- docs: reflect actual number of spaces for indent by @booxter in #1052
- fix: agent config validation by @ehhuang in #1053
- feat: add MetricResponseMixin to chat completion response types by @dineshyv in #1050
- feat: make telemetry attributes be dict[str,PrimitiveType] by @dineshyv in #1055
- fix: filter out remote::sample providers when listing by @booxter in #1057
- feat: Support tool calling for non-streaming chat completion in remote vLLM provider by @terrytangyuan in #1034
- perf: ensure ToolCall in ChatCompletionResponse is subset of ChatCompletionRequest.tools by @yanxi0830 in #1041
- chore: update return type to Optional[str] by @leseb in #982
- feat: Support tool calling for streaming chat completion in remote vLLM provider by @terrytangyuan in #1063
- fix: show proper help text by @cdoern in #1065
- feat: add support for running in a venv by @cdoern in #1018
- feat: Adding sqlite-vec as a vectordb by @franciscojavierarceo in #1040
- feat: support listing all for `llama stack list-providers` by @booxter in #1056
- docs: Mention conventional commits format in CONTRIBUTING.md by @bbrowning in #1075
- fix: logprobs support in remote-vllm provider by @bbrowning in #1074
- fix: improve signal handling and update dependencies by @leseb in #1044
- style: update model id in model list title by @reidliu41 in #1072
- fix: make backslash work in GET /models/{model_id:path} by @yanxi0830 in #1068
- chore: Link to Groq docs in the warning message for preview model by @terrytangyuan in #1060
- fix: remove :path in agents by @yanxi0830 in #1077
- build: format codebase imports using ruff linter by @leseb in #1028
- chore: Consistent naming for VectorIO providers by @terrytangyuan in #1023
- test: Enable logprobs top_k tests for remote::vllm by @terrytangyuan in #1080
- docs: Fix url to the llama-stack-spec yaml/html files by @vishnoianil in #1081
- fix: Update VectorIO config classes in registry by @terrytangyuan in #1079
- test: Add qdrant to provider tests by @jwm4 in #1039
- test: add test for Agent.create_turn non-streaming response by @ehhuang in #1078
- fix!: update eval-tasks -> benchmarks by @yanxi0830 in #1032
- fix: openapi for eval-task by @yanxi0830 in #1085
- fix: regex pattern matching to support :path suffix in the routes by @hardikjshah in #1089
- fix: disable sqlite-vec test by @yanxi0830 in #1090
- fix: add the missed help description info by @reidliu41 in #1096
- fix: Update QdrantConfig to QdrantVectorIOConfig by @bbrowning in #1104
- docs: Add region parameter to Bedrock provider by @raghotham in #1103
- build: configure ruff from pyproject.toml by @leseb in #1100
- chore: move all Llama Stack types from llama-models to llama-stack by @ashwinb in #1098
- fix: enable_session_persistence in AgentConfig should be optional by @terrytangyuan in #1012
- fix: improve stack build on venv by @leseb in #980
- fix: remove the empty line by @reidliu41 in #1097
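Several fixes above (#1068, #1077, #1089) concern routes whose parameters carry a `:path` suffix, which must match values containing slashes (e.g. model IDs like `org/name`). A minimal sketch of how such a template can be compiled to a regex; this is illustrative, not the actual router implementation:

```python
import re

def route_to_regex(route: str) -> re.Pattern:
    # Parameters written as {name:path} match anything, including "/";
    # plain {name} parameters stop at the next "/".
    def repl(m: re.Match) -> str:
        name, suffix = m.group(1), m.group(2)
        if suffix == ":path":
            return f"(?P<{name}>.+)"
        return f"(?P<{name}>[^/]+)"
    pattern = re.sub(r"\{(\w+)(:path)?\}", repl, route)
    return re.compile(f"^{pattern}$")

m = route_to_regex("/models/{model_id:path}").match("/models/org/name")
print(m.group("model_id"))  # org/name
```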
New Contributors
- @MichaelClifford made their first contribution in #1009
- @ellistarn made their first contribution in #1035
- @kelbrown20 made their first contribution in #992
- @franciscojavierarceo made their first contribution in #1040
- @bbrowning made their first contribution in #1075
- @reidliu41 made their first contribution in #1072
- @vishnoianil made their first contribution in #1081
Full Changelog: v0.1.2...v0.1.3