v0.1.4

@hardikjshah released this 25 Feb 00:02

v0.1.4 Release Notes

Here are the key changes coming as part of this release:

Build and Test Agents

  • Inference: Added support for non-llama models
  • Inference: Added option to list all downloaded models and remove models
  • Agent: Introduced a new API, agents.resume_turn, to include client-side tool execution in the same turn
  • Agent: AgentConfig introduces a new variable, “tool_config”, that allows for better tool configuration and system prompt overrides
  • Agent: Added logging for agent step start and completion times
  • Agent: Added support for logging tool execution metadata
  • Embedding: Updated /inference/embeddings to support asymmetric models, truncation, and variable-sized outputs
  • Embedding: Updated embedding models for Ollama, Together, and Fireworks with available defaults
  • VectorIO: Improved performance of sqlite-vec using chunked writes
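
The chunked-write idea from the VectorIO item can be illustrated with a minimal sketch. This is not the sqlite-vec provider code from #1094. It assumes a plain SQLite table and uses only Python's standard sqlite3 module, and the table name, column names, batch size, and helper functions are all hypothetical. The point is only that committing one batch at a time keeps each transaction small instead of issuing one write per row or one very large write.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# Hypothetical row shape: (chunk_id, content). Not the provider's real schema.
Row = Tuple[str, str]


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def write_in_chunks(conn: sqlite3.Connection, rows: Iterable[Row], batch_size: int = 500) -> int:
    """Insert rows batch by batch and return the number of batches written."""
    batches = 0
    for batch in batched(rows, batch_size):
        # Using the connection as a context manager commits the batch on
        # success and rolls it back if an exception is raised.
        with conn:
            conn.executemany("INSERT INTO chunks (id, content) VALUES (?, ?)", batch)
        batches += 1
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (id TEXT PRIMARY KEY, content TEXT)")

rows = [(f"chunk-{i}", f"text {i}") for i in range(1200)]
num_batches = write_in_chunks(conn, rows, batch_size=500)
row_count = conn.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
print(num_batches, row_count)
```

With 1200 rows and a batch size of 500, the rows are written in three transactions. The right batch size for a real workload depends on row size and memory limits, so treat 500 as a placeholder.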

Agent Evals and Model Customization

  • Deprecated the /eval-tasks API. Use /eval/benchmark instead
  • Added CPU training support for TorchTune

Deploy and Monitoring of Agents

  • Consistent view of client and server tool calls in telemetry

Better Engineering

  • Made tests more data-driven for consistent evaluation
  • Fixed documentation links and improved API reference generation
  • Various small fixes for build scripts and system reliability
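
As a rough illustration of what "data-driven" tests can look like, the sketch below keeps test cases as data and runs one generic test body over them. It is not taken from the llama-stack test suite. The `normalize` function, the case table, and the test class are hypothetical, and only Python's standard unittest module is used.

```python
import unittest

# Hypothetical case table. In a real suite this could be loaded from a
# JSON or YAML file so that cases can be added without touching test code.
TEST_CASES = [
    {"name": "lowercase", "input": "Hello", "expected": "hello"},
    {"name": "strip_spaces", "input": "  hi  ", "expected": "hi"},
    {"name": "already_clean", "input": "ok", "expected": "ok"},
]


def normalize(text: str) -> str:
    """Hypothetical function under test: trim whitespace and lowercase."""
    return text.strip().lower()


class NormalizeTest(unittest.TestCase):
    def test_cases(self) -> None:
        # subTest reports each failing case separately instead of
        # stopping at the first failure.
        for case in TEST_CASES:
            with self.subTest(case=case["name"]):
                self.assertEqual(normalize(case["input"]), case["expected"])


suite = unittest.TestLoader().loadTestsFromTestCase(NormalizeTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Adding a new scenario then means adding one entry to the case table, which keeps the evaluation logic identical across cases.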

What's Changed

  • build: resync uv and deps on 0.1.3 by @leseb in #1108
  • style: fix the capitalization issue by @reidliu41 in #1117
  • feat: log start, complete time to Agent steps by @ehhuang in #1116
  • fix: Ensure a tool call can be converted before adding to buffer by @terrytangyuan in #1119
  • docs: Fix incorrect link and command for generating API reference by @terrytangyuan in #1124
  • chore: remove --no-list-templates option by @reidliu41 in #1121
  • style: update verify-download help text by @reidliu41 in #1134
  • style: update download help text by @reidliu41 in #1135
  • fix: modify the model id title for model list by @reidliu41 in #1095
  • fix: direct client pydantic type casting by @yanxi0830 in #1145
  • style: remove prints in codebase by @yanxi0830 in #1146
  • feat: support tool_choice = {required, none, } by @ehhuang in #1059
  • test: Enable test_text_chat_completion_with_tool_choice_required for remote::vllm by @terrytangyuan in #1148
  • fix(rag-example): add provider_id to avoid llama_stack_client 400 error by @fulvius31 in #1114
  • fix: Get distro_codegen.py working with default deps and enabled in pre-commit hooks by @bbrowning in #1123
  • chore: remove llama_models.llama3.api imports from providers by @ashwinb in #1107
  • docs: fix Python llama_stack_client SDK links by @leseb in #1150
  • feat: Chunk sqlite-vec writes by @franciscojavierarceo in #1094
  • fix: miscellaneous job management improvements in torchtune by @booxter in #1136
  • feat: add aggregation_functions to llm_as_judge_405b_simpleqa by @SLR722 in #1164
  • feat: inference passthrough provider by @SLR722 in #1166
  • docs: Remove unused python-openapi and json-strong-typing in openapi_generator by @terrytangyuan in #1167
  • docs: improve API contribution guidelines by @leseb in #1137
  • feat: add a option to list the downloaded models by @reidliu41 in #1127
  • fix: Fixing some small issues with the build scripts by @franciscojavierarceo in #1132
  • fix: llama stack build use UV_SYSTEM_PYTHON to install dependencies to system environment by @yanxi0830 in #1163
  • build: add missing dev dependencies for unit tests by @leseb in #1004
  • fix: More robust handling of the arguments in tool call response in remote::vllm by @terrytangyuan in #1169
  • Added support for mongoDB KV store by @shrinitg in #543
  • script for running client sdk tests by @sixianyi0721 in #895
  • test: skip model registration for unsupported providers by @leseb in #1030
  • feat: Enable CPU training for torchtune by @booxter in #1140
  • fix: add logging import by @raspawar in #1174
  • docs: Add note about distro_codegen.py and provider dependencies by @bbrowning in #1175
  • chore: slight renaming of model alias stuff by @ashwinb in #1181
  • feat: adding endpoints for files and uploads by @vladimirivic in #1070
  • docs: Fix Links, Add Podman Instructions, Vector DB Unregister, and Example Script by @kevincogan in #1129
  • chore!: deprecate eval/tasks by @yanxi0830 in #1186
  • fix: some telemetry APIs don't currently work by @ehhuang in #1188
  • feat: D69478008 [llama-stack] turning tests into data-driven by @LESSuseLESS in #1180
  • feat: register embedding models for ollama, together, fireworks by @ashwinb in #1190
  • feat(providers): add NVIDIA Inference embedding provider and tests by @mattf in #935
  • docs: Add missing uv command for docs generation in contributing guide by @terrytangyuan in #1197
  • docs: Simplify installation guide with uv by @terrytangyuan in #1196
  • fix: BuiltinTool JSON serialization in remote vLLM provider by @bbrowning in #1183
  • ci: improve GitHub Actions workflow for website builds by @leseb in #1151
  • fix: pass tool_prompt_format to chat_formatter by @ehhuang in #1198
  • fix(api): update embeddings signature so inputs and outputs list align by @ashwinb in #1161
  • feat(api): Add options for supporting various embedding models by @ashwinb in #1192
  • fix: update URL import, URL -> ImageContentItemImageURL by @mattf in #1204
  • feat: model remove cmd by @reidliu41 in #1128
  • chore: remove configure subcommand by @reidliu41 in #1202
  • fix: remove list of list tests, no longer relevant after #1161 by @mattf in #1205
  • test(client-sdk): Update embedding test types to use latest imports by @raspawar in #1203
  • fix: convert back to model descriptor for model in list --downloaded by @reidliu41 in #1201
  • docs: Add missing uv command and clarify website rebuild by @terrytangyuan in #1199
  • fix: Updating images so that they are able to run without root access by @jland-redhat in #1208
  • fix: pull ollama embedding model if necessary by @ashwinb in #1209
  • chore: move embedding deps to RAG tool where they are needed by @ashwinb in #1210
  • feat(1/n): api: unify agents for handling server & client tools by @yanxi0830 in #1178
  • feat: tool outputs metadata by @ehhuang in #1155
  • ci: add mypy for static type checking by @leseb in #1101
  • feat(providers): support non-llama models for inference providers by @ashwinb in #1200
  • test: fix test_rag_agent test by @ehhuang in #1215
  • feat: add substring search for model list by @reidliu41 in #1099
  • test: do not overwrite agent_config by @ehhuang in #1216
  • docs: Adding Provider sections to docs by @franciscojavierarceo in #1195
  • fix: update virtualenv building so llamastack- prefix is not added, make notebook experience easier by @ashwinb in #1225
  • feat: add --run to llama stack build by @cdoern in #1156
  • docs: Add vLLM to the list of inference providers in concepts and providers pages by @terrytangyuan in #1227
  • docs: small fixes by @reidliu41 in #1224
  • fix: avoid failure when no special pip deps and better exit by @leseb in #1228
  • fix: set default tool_prompt_format in inference api by @ehhuang in #1214
  • test: fix test_tool_choice by @ehhuang in #1234

Full Changelog: v0.1.3...v0.1.4