v0.1.3 Release
Here are the key changes included in this release.
Build and Test Agents
Streamlined the initial development experience
- Added support for `llama stack run --image-type venv`
- Enhanced vector store options with new sqlite-vec provider and improved Qdrant integration
- vLLM improvements for tool calling and logprobs
- Better handling of sporadic `code_interpreter` tool calls
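To build intuition for the new sqlite-vec vector store option, here is a brute-force, illustrative sketch of what a SQLite-backed vector store does conceptually: store embeddings alongside text and rank rows by cosine similarity. The real provider uses the sqlite-vec extension for efficient search; every name below is hypothetical and for illustration only.

```python
import json
import math
import sqlite3

# In-memory SQLite table holding text chunks and their embeddings
# (stored as JSON for simplicity in this sketch).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, emb TEXT)")

def add(text, emb):
    db.execute("INSERT INTO chunks (text, emb) VALUES (?, ?)", (text, json.dumps(emb)))

def search(query_emb, k=1):
    # Brute-force cosine-similarity ranking over all stored rows.
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
        return dot / norm
    rows = db.execute("SELECT text, emb FROM chunks").fetchall()
    ranked = sorted(rows, key=lambda r: cos(query_emb, json.loads(r[1])), reverse=True)
    return [text for text, _ in ranked[:k]]

add("llama stack docs", [1.0, 0.0])
add("unrelated note", [0.0, 1.0])
print(search([0.9, 0.1]))  # ['llama stack docs']
```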
Agent Evals
Better benchmarking and agent performance assessment
- Renamed the eval API `/eval-task` to `/benchmarks`
- Improved documentation and notebooks for RAG and evals
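For clients still targeting the old eval route, the rename above amounts to a simple path substitution. The helper below is illustrative only, not part of the official client:

```python
# Illustrative migration helper for the /eval-task -> /benchmarks
# rename (hypothetical function, not an official API).
def migrate_route(path: str) -> str:
    return path.replace("/eval-task", "/benchmarks")

print(migrate_route("/v1/eval-task/list"))  # /v1/benchmarks/list
```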
Deployment and Monitoring of Agents
Improved production readiness
- Added usage metrics collection for chat completions
- CLI improvements for provider information
- Improved error handling and system reliability
- Better model endpoint handling and accessibility
- Improved signal handling on the distro server
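The signal-handling improvement is about shutting down cleanly when the server process receives SIGTERM or SIGINT. A minimal sketch of the pattern, with hypothetical names that are not the actual distro server code:

```python
import signal
import sys

# Illustrative graceful-shutdown sketch: on SIGTERM/SIGINT, do cleanup
# (e.g. close connections, flush telemetry) and exit with status 0
# instead of dying mid-request.
def handle_shutdown(signum, frame):
    # Cleanup work would go here before exiting.
    sys.exit(0)

signal.signal(signal.SIGTERM, handle_shutdown)
signal.signal(signal.SIGINT, handle_shutdown)
```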
Better Engineering
Infrastructure and code quality improvements
- Faster text-based chat completion tests
- Improved testing for non-streaming agent APIs
- Standardized import formatting with ruff linter
- Adopted the Conventional Commits standard
- Fixed documentation parsing issues
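The ruff import-formatting change means import order is now enforced by the linter. An assumed `pyproject.toml` fragment showing the general shape of such a setup (not the project's actual configuration):

```toml
# Illustrative ruff configuration enabling isort-style import sorting
# (rule group "I"); assumed, not copied from the repository.
[tool.ruff.lint]
select = ["I"]
```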
What's Changed
- Getting started notebook update by @jeffxtang in #936
- docs: update index.md for 0.1.2 by @raghotham in #1013
- test: Make text-based chat completion tests run 10x faster by @terrytangyuan in #1016
- chore: Updated requirements.txt by @cheesecake100201 in #1017
- test: Use JSON tool prompt format for remote::vllm provider by @terrytangyuan in #1019
- docs: Render check marks correctly on PyPI by @terrytangyuan in #1024
- docs: update rag.md example code to prevent errors by @MichaelClifford in #1009
- build: update uv lock to sync package versions by @leseb in #1026
- fix: Gaps in doc codegen by @ellistarn in #1035
- fix: Readthedocs cannot parse comments, resulting in docs bugs by @ellistarn in #1033
- fix: a bad newline in ollama docs by @ellistarn in #1036
- fix: Update Qdrant support post-refactor by @jwm4 in #1022
- test: replace blocked image URLs with GitHub-hosted by @leseb in #1025
- fix: Added missing `tool_config` arg in SambaNova `chat_completion()` by @terrytangyuan in #1042
- docs: Updating wording and nits in the README.md by @kelbrown20 in #992
- docs: remove changelog mention from PR template by @leseb in #1049
- docs: reflect actual number of spaces for indent by @booxter in #1052
- fix: agent config validation by @ehhuang in #1053
- feat: add MetricResponseMixin to chat completion response types by @dineshyv in #1050
- feat: make telemetry attributes be dict[str,PrimitiveType] by @dineshyv in #1055
- fix: filter out remote::sample providers when listing by @booxter in #1057
- feat: Support tool calling for non-streaming chat completion in remote vLLM provider by @terrytangyuan in #1034
- perf: ensure ToolCall in ChatCompletionResponse is subset of ChatCompletionRequest.tools by @yanxi0830 in #1041
- chore: update return type to Optional[str] by @leseb in #982
- feat: Support tool calling for streaming chat completion in remote vLLM provider by @terrytangyuan in #1063
- fix: show proper help text by @cdoern in #1065
- feat: add support for running in a venv by @cdoern in #1018
- feat: Adding sqlite-vec as a vectordb by @franciscojavierarceo in #1040
- feat: support listing all for `llama stack list-providers` by @booxter in #1056
- docs: Mention conventional commits format in CONTRIBUTING.md by @bbrowning in #1075
- fix: logprobs support in remote-vllm provider by @bbrowning in #1074
- fix: improve signal handling and update dependencies by @leseb in #1044
- style: update model id in model list title by @reidliu41 in #1072
- fix: make backslash work in GET /models/{model_id:path} by @yanxi0830 in #1068
- chore: Link to Groq docs in the warning message for preview model by @terrytangyuan in #1060
- fix: remove :path in agents by @yanxi0830 in #1077
- build: format codebase imports using ruff linter by @leseb in #1028
- chore: Consistent naming for VectorIO providers by @terrytangyuan in #1023
- test: Enable logprobs top_k tests for remote::vllm by @terrytangyuan in #1080
- docs: Fix url to the llama-stack-spec yaml/html files by @vishnoianil in #1081
- fix: Update VectorIO config classes in registry by @terrytangyuan in #1079
- test: Add qdrant to provider tests by @jwm4 in #1039
- test: add test for Agent.create_turn non-streaming response by @ehhuang in #1078
- fix!: update eval-tasks -> benchmarks by @yanxi0830 in #1032
- fix: openapi for eval-task by @yanxi0830 in #1085
- fix: regex pattern matching to support :path suffix in the routes by @hardikjshah in #1089
- fix: disable sqlite-vec test by @yanxi0830 in #1090
- fix: add the missed help description info by @reidliu41 in #1096
- fix: Update QdrantConfig to QdrantVectorIOConfig by @bbrowning in #1104
- docs: Add region parameter to Bedrock provider by @raghotham in #1103
- build: configure ruff from pyproject.toml by @leseb in #1100
- chore: move all Llama Stack types from llama-models to llama-stack by @ashwinb in #1098
- fix: enable_session_persistence in AgentConfig should be optional by @terrytangyuan in #1012
- fix: improve stack build on venv by @leseb in #980
- fix: remove the empty line by @reidliu41 in #1097
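Several fixes above (#1068, #1077, #1089) concern routes whose parameters carry a `:path` suffix, which must match values containing slashes (e.g. model IDs like `org/name`). A minimal sketch of how such a template can be compiled to a regex; this is illustrative, not the actual router implementation:

```python
import re

def route_to_regex(route: str) -> re.Pattern:
    # Parameters written as {name:path} match anything, including "/";
    # plain {name} parameters stop at the next "/".
    def repl(m: re.Match) -> str:
        name, suffix = m.group(1), m.group(2)
        if suffix == ":path":
            return f"(?P<{name}>.+)"
        return f"(?P<{name}>[^/]+)"
    pattern = re.sub(r"\{(\w+)(:path)?\}", repl, route)
    return re.compile(f"^{pattern}$")

m = route_to_regex("/models/{model_id:path}").match("/models/org/name")
print(m.group("model_id"))  # org/name
```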
New Contributors
- @MichaelClifford made their first contribution in #1009
- @ellistarn made their first contribution in #1035
- @kelbrown20 made their first contribution in #992
- @franciscojavierarceo made their first contribution in #1040
- @bbrowning made their first contribution in #1075
- @reidliu41 made their first contribution in #1072
- @vishnoianil made their first contribution in #1081
Full Changelog: v0.1.2...v0.1.3