Release v0.1.4 · meta-llama/llama-stack

v0.1.4 Release Notes

Here are the key changes coming as part of this release:

Build and Test Agents

Inference: Added support for non-llama models
Inference: Added options to list all downloaded models and to remove models
Agent: Introduced a new API, agents.resume_turn, to include client-side tool execution in the same turn
Agent: AgentConfig introduces a new variable, “tool_config”, that allows for better tool configuration and system prompt overrides
Agent: Added logging for agent step start and completion times
Agent: Added support for logging tool execution metadata
Embedding: Updated /inference/embeddings to support asymmetric models, truncation, and variable-sized outputs
Embedding: Updated embedding models for Ollama, Together, and Fireworks with available defaults
VectorIO: Improved performance of sqlite-vec using chunked writes
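
The chunked-write idea behind the sqlite-vec item can be illustrated with a minimal sketch. This is not the provider's actual code: it uses only the standard-library sqlite3 module, an ordinary table stands in for a sqlite-vec virtual table, and the names `batched`, `write_chunks`, the `chunks` table, and the batch size of 500 are all hypothetical. It only shows the general pattern of splitting a large insert into fixed-size batches, each committed in its own transaction.

```python
import sqlite3
from typing import Iterator, Sequence, Tuple

Row = Tuple[str, str]  # (chunk id, chunk text); hypothetical schema


def batched(rows: Sequence[Row], size: int) -> Iterator[Sequence[Row]]:
    """Yield consecutive slices of `rows`, each with at most `size` items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def write_chunks(conn: sqlite3.Connection, rows: Sequence[Row], batch_size: int = 500) -> int:
    """Insert rows in batches, one transaction per batch; return the batch count."""
    batches = 0
    for batch in batched(rows, batch_size):
        with conn:  # commits on success, rolls back if the batch fails
            conn.executemany(
                "INSERT OR REPLACE INTO chunks (id, content) VALUES (?, ?)",
                batch,
            )
        batches += 1
    return batches


# Usage: an in-memory database with a plain table (assumption: no sqlite-vec extension loaded).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (id TEXT PRIMARY KEY, content TEXT)")
rows = [(f"doc-{i}", f"text {i}") for i in range(1200)]
n = write_chunks(conn, rows, batch_size=500)
```

Keeping each batch in its own transaction bounds the size of any single write and limits how much work is lost if one batch fails; the right batch size depends on the workload and is not specified in these notes.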

Agent Evals and Model Customization

Deprecated the /eval-tasks API. Use /eval/benchmark instead
Added CPU training support for TorchTune

Deploy and Monitoring of Agents

Consistent view of client and server tool calls in telemetry

Better Engineering

Made tests more data-driven for consistent evaluation
Fixed documentation links and improved API reference generation
Various small fixes for build scripts and system reliability

What's Changed

build: resync uv and deps on 0.1.3 by @leseb in #1108
style: fix the capitalization issue by @reidliu41 in #1117
feat: log start, complete time to Agent steps by @ehhuang in #1116
fix: Ensure a tool call can be converted before adding to buffer by @terrytangyuan in #1119
docs: Fix incorrect link and command for generating API reference by @terrytangyuan in #1124
chore: remove --no-list-templates option by @reidliu41 in #1121
style: update verify-download help text by @reidliu41 in #1134
style: update download help text by @reidliu41 in #1135
fix: modify the model id title for model list by @reidliu41 in #1095
fix: direct client pydantic type casting by @yanxi0830 in #1145
style: remove prints in codebase by @yanxi0830 in #1146
feat: support tool_choice = {required, none, } by @ehhuang in #1059
test: Enable test_text_chat_completion_with_tool_choice_required for remote::vllm by @terrytangyuan in #1148
fix(rag-example): add provider_id to avoid llama_stack_client 400 error by @fulvius31 in #1114
fix: Get distro_codegen.py working with default deps and enabled in pre-commit hooks by @bbrowning in #1123
chore: remove llama_models.llama3.api imports from providers by @ashwinb in #1107
docs: fix Python llama_stack_client SDK links by @leseb in #1150
feat: Chunk sqlite-vec writes by @franciscojavierarceo in #1094
fix: miscellaneous job management improvements in torchtune by @booxter in #1136
feat: add aggregation_functions to llm_as_judge_405b_simpleqa by @SLR722 in #1164
feat: inference passthrough provider by @SLR722 in #1166
docs: Remove unused python-openapi and json-strong-typing in openapi_generator by @terrytangyuan in #1167
docs: improve API contribution guidelines by @leseb in #1137
feat: add a option to list the downloaded models by @reidliu41 in #1127
fix: Fixing some small issues with the build scripts by @franciscojavierarceo in #1132
fix: llama stack build use UV_SYSTEM_PYTHON to install dependencies to system environment by @yanxi0830 in #1163
build: add missing dev dependencies for unit tests by @leseb in #1004
fix: More robust handling of the arguments in tool call response in remote::vllm by @terrytangyuan in #1169
Added support for mongoDB KV store by @shrinitg in #543
script for running client sdk tests by @sixianyi0721 in #895
test: skip model registration for unsupported providers by @leseb in #1030
feat: Enable CPU training for torchtune by @booxter in #1140
fix: add logging import by @raspawar in #1174
docs: Add note about distro_codegen.py and provider dependencies by @bbrowning in #1175
chore: slight renaming of model alias stuff by @ashwinb in #1181
feat: adding endpoints for files and uploads by @vladimirivic in #1070
docs: Fix Links, Add Podman Instructions, Vector DB Unregister, and Example Script by @kevincogan in #1129
chore!: deprecate eval/tasks by @yanxi0830 in #1186
fix: some telemetry APIs don't currently work by @ehhuang in #1188
feat: D69478008 [llama-stack] turning tests into data-driven by @LESSuseLESS in #1180
feat: register embedding models for ollama, together, fireworks by @ashwinb in #1190
feat(providers): add NVIDIA Inference embedding provider and tests by @mattf in #935
docs: Add missing uv command for docs generation in contributing guide by @terrytangyuan in #1197
docs: Simplify installation guide with uv by @terrytangyuan in #1196
fix: BuiltinTool JSON serialization in remote vLLM provider by @bbrowning in #1183
ci: improve GitHub Actions workflow for website builds by @leseb in #1151
fix: pass tool_prompt_format to chat_formatter by @ehhuang in #1198
fix(api): update embeddings signature so inputs and outputs list align by @ashwinb in #1161
feat(api): Add options for supporting various embedding models by @ashwinb in #1192
fix: update URL import, URL -> ImageContentItemImageURL by @mattf in #1204
feat: model remove cmd by @reidliu41 in #1128
chore: remove configure subcommand by @reidliu41 in #1202
fix: remove list of list tests, no longer relevant after #1161 by @mattf in #1205
test(client-sdk): Update embedding test types to use latest imports by @raspawar in #1203
fix: convert back to model descriptor for model in list --downloaded by @reidliu41 in #1201
docs: Add missing uv command and clarify website rebuild by @terrytangyuan in #1199
fix: Updating images so that they are able to run without root access by @jland-redhat in #1208
fix: pull ollama embedding model if necessary by @ashwinb in #1209
chore: move embedding deps to RAG tool where they are needed by @ashwinb in #1210
feat(1/n): api: unify agents for handling server & client tools by @yanxi0830 in #1178
feat: tool outputs metadata by @ehhuang in #1155
ci: add mypy for static type checking by @leseb in #1101
feat(providers): support non-llama models for inference providers by @ashwinb in #1200
test: fix test_rag_agent test by @ehhuang in #1215
feat: add substring search for model list by @reidliu41 in #1099
test: do not overwrite agent_config by @ehhuang in #1216
docs: Adding Provider sections to docs by @franciscojavierarceo in #1195
fix: update virtualenv building so llamastack- prefix is not added, make notebook experience easier by @ashwinb in #1225
feat: add --run to llama stack build by @cdoern in #1156
docs: Add vLLM to the list of inference providers in concepts and providers pages by @terrytangyuan in #1227
docs: small fixes by @reidliu41 in #1224
fix: avoid failure when no special pip deps and better exit by @leseb in #1228
fix: set default tool_prompt_format in inference api by @ehhuang in #1214
test: fix test_tool_choice by @ehhuang in #1234

New Contributors

@fulvius31 made their first contribution in #1114
@shrinitg made their first contribution in #543
@raspawar made their first contribution in #1174
@kevincogan made their first contribution in #1129
@LESSuseLESS made their first contribution in #1180
@jland-redhat made their first contribution in #1208

Full Changelog: v0.1.3...v0.1.4
