v0.1.0rc12
Pre-release
What's Changed
- [4/n][torchtune integration] support lazy load model during inference by @SLR722 in #620
- remove unused telemetry related code for console by @dineshyv in #659
- Fix Meta reference GPU implementation by @ashwinb in #663
- Fixed imports for inference by @cdgamarose-nv in #661
- fix trace starting in library client by @dineshyv in #655
- Add Llama 70B 3.3 to fireworks by @aidando73 in #654
- Tools API with brave and MCP providers by @dineshyv in #639
- [torchtune integration] post training + eval by @SLR722 in #670
- Fix post training apis broken by torchtune release by @SLR722 in #674
- Add missing venv option in --image-type by @terrytangyuan in #677
- Removed unnecessary CONDA_PREFIX env var in installation guide by @terrytangyuan in #683
- Add 3.3 70B to Ollama inference provider by @aidando73 in #681
- docs: update evals_reference/index.md by @eltociear in #675
- [remove import ][1/n] clean up import & in apis/ by @yanxi0830 in #689
- [bugfix] fix broken vision inference, change serialization for bytes by @yanxi0830 in #693
- Minor Quick Start documentation updates. by @derekslager in #692
- [bugfix] fix meta-reference agents w/ safety multiple model loading pytest by @yanxi0830 in #694
- [bugfix] fix prompt_adapter interleaved_content_convert_to_raw by @yanxi0830 in #696
- Add missing "inline::" prefix for providers in building_distro.md by @terrytangyuan in #702
- Fix failing flake8 E226 check by @terrytangyuan in #701
- Add missing newlines before printing the Dockerfile content by @terrytangyuan in #700
- Add JSON structured outputs to Ollama Provider by @aidando73 in #680
- [#407] Agents: Avoid calling tools that haven't been explicitly enabled by @aidando73 in #637
- Made changes to readme and pinning to llamastack v0.0.61 by @heyjustinai in #624
- [rag evals][1/n] refactor base scoring fn & data schema check by @yanxi0830 in #664
- [Post Training] Fix missing import by @SLR722 in #705
- Import from the right path by @SLR722 in #708
- [#432] Add Groq Provider - chat completions by @aidando73 in #609
- Change post training run.yaml inference config by @SLR722 in #710
- [Post training] make validation steps configurable by @SLR722 in #715
- Fix incorrect entrypoint for broken `llama stack run` by @terrytangyuan in #706
- Fix assert message and call to completion_request_to_prompt in remote:vllm by @terrytangyuan in #709
- Fix Groq invalid self.config reference by @aidando73 in #719
- support llama3.1 8B instruct in post training by @SLR722 in #698
- remove default logger handlers when using libcli with notebook by @dineshyv in #718
- move DataSchemaValidatorMixin into standalone utils by @yanxi0830 in #720
- add 3.3 to together inference provider by @yanxi0830 in #729
- Update CODEOWNERS - add sixianyi0721 as the owner by @sixianyi0721 in #731
- fix links for distro by @yanxi0830 in #733
- add --version to llama stack CLI & /version endpoint by @yanxi0830 in #732
- agents to use tools api by @dineshyv in #673
- Add X-LlamaStack-Client-Version, rename ProviderData -> Provider-Data by @ashwinb in #735
- Check version incompatibility by @ashwinb in #738
- Add persistence for localfs datasets by @VladOS95-cyber in #557
- Fixed typo in default VLLM_URL in remote-vllm.md by @terrytangyuan in #723
- Consolidating Memory tests under client-sdk by @vladimirivic in #703
- Expose LLAMASTACK_PORT in cli.stack.run by @terrytangyuan in #722
- remove conflicting default for tool prompt format in chat completion by @dineshyv in #742
- rename LLAMASTACK_PORT to LLAMA_STACK_PORT for consistency with other env vars by @raghotham in #744
- Add inline vLLM inference provider to regression tests and fix regressions by @frreiss in #662
- [CICD] github workflow to push nightly package to testpypi by @yanxi0830 in #734
- Replaced zrangebylex method in the range method by @cheesecake100201 in #521
- Improve model download doc by @SLR722 in #748
- Support building UBI9 base container image by @terrytangyuan in #676
- update notebook to use new tool defs by @dineshyv in #745
- Add provider data passing for library client by @dineshyv in #750
- [Fireworks] Update model name for Fireworks by @benjibc in #753
- Consolidating Inference tests under client-sdk tests by @vladimirivic in #751
- Consolidating Safety tests from various places under client-sdk by @vladimirivic in #699
- [CI/CD] more robust re-try for downloading testpypi package by @yanxi0830 in #749
- [#432] Add Groq Provider - tool calls by @aidando73 in #630
- Rename ipython to tool by @ashwinb in #756
- Fix incorrect Python binary path for UBI9 image by @terrytangyuan in #757
- Update Cerebras docs to include header by @henrytwo in #704
- Add init files to post training folders by @SLR722 in #711
- Switch to use importlib instead of deprecated pkg_resources by @terrytangyuan in #678
- [bugfix] fix streaming GeneratorExit exception with LlamaStackAsLibraryClient by @yanxi0830 in #760
- Fix telemetry to work on reinstantiating new lib cli by @dineshyv in #761
- [post training] define llama stack post training dataset format by @SLR722 in #717
- add braintrust to experimental-post-training template by @SLR722 in #763
- added support of PYPI_VERSION in stack build by @jeffxtang in #762
- Fix broken tests in test_registry by @vladimirivic in #707
- Fix fireworks run-with-safety template by @vladimirivic in #766
- Free up memory after post training finishes by @SLR722 in #770
- Fix issue when generating distros by @terrytangyuan in #755
- Convert `SamplingParams.strategy` to a union by @hardikjshah in #767
- [CICD] Github workflow for publishing Docker images by @yanxi0830 in #764
- [bugfix] fix llama guard parsing ContentDelta by @yanxi0830 in #772
- rebase eval test w/ tool_runtime fixtures by @yanxi0830 in #773
- More idiomatic REST API by @dineshyv in #765
- add nvidia distribution by @cdgamarose-nv in #565
- bug fixes on inference tests by @sixianyi0721 in #774
- [bugfix] fix inference sdk test for v1 by @yanxi0830 in #775
- fix routing in library client by @dineshyv in #776
- [bugfix] fix client-sdk tests for v1 by @yanxi0830 in #777
- fix nvidia inference provider by @yanxi0830 in #781
- Make notebook testable by @hardikjshah in #780
- Fix telemetry by @dineshyv in #787
- fireworks add completion logprobs adapter by @yanxi0830 in #778
- Idiomatic REST API: Inspect by @dineshyv in #779
- Idiomatic REST API: Evals by @dineshyv in #782
- Add notebook testing to nightly build job by @hardikjshah in #785
- [test automation] support run tests on config file by @sixianyi0721 in #730
- Idiomatic REST API: Telemetry by @dineshyv in #786
- Make llama stack build not create a new conda by default by @ashwinb in #788
- REST API fixes by @dineshyv in #789
- fix cerebras template by @yanxi0830 in #790
- [Test automation] generate custom test report by @sixianyi0721 in #739
- cerebras template update for memory by @yanxi0830 in #792
- Pin torchtune pkg version by @SLR722 in #791
- fix the code execution test in sdk tests by @dineshyv in #794
- add default toolgroups to all providers by @dineshyv in #795
- Fix tgi adapter by @yanxi0830 in #796
- Remove llama-guard in Cerebras template & improve agent test by @yanxi0830 in #798
- meta reference inference fixes by @ashwinb in #797
- fix provider model list test by @hardikjshah in #800
- fix playground for v1 by @yanxi0830 in #799
- fix eval notebook & add test to workflow by @yanxi0830 in #803
- add json_schema_type to ParamType deps by @dineshyv in #808
- Fixing small typo in quick start guide by @pmccarthy in #807
- cannot import name 'GreedySamplingStrategy' by @aidando73 in #806
- optional api dependencies by @ashwinb in #793
- fix vllm template by @yanxi0830 in #813
- More generic image type for OCI-compliant container technologies by @terrytangyuan in #802
- add mcp runtime as default to all providers by @dineshyv in #816
- fix vllm base64 image inference by @yanxi0830 in #815
- fix again vllm for non base64 by @yanxi0830 in #818
- Fix incorrect RunConfigSettings due to the removal of conda_env by @terrytangyuan in #801
- Fix incorrect image type in publish-to-docker workflow by @terrytangyuan in #819
- test report for v0.1 by @sixianyi0721 in #814
- [CICD] add simple test step for docker build workflow, fix prefix bug by @yanxi0830 in #821
- add section for mcp tool usage in notebook by @dineshyv in #831
- [ez] structured output for /completion ollama & enable tests by @sixianyi0721 in #822
- add pytest option to generate a functional report for distribution by @sixianyi0721 in #833
- bug fix for distro report generation by @sixianyi0721 in #836
- [memory refactor][1/n] Rename Memory -> VectorIO, MemoryBanks -> VectorDBs by @ashwinb in #828
- [memory refactor][2/n] Update faiss and make it pass tests by @ashwinb in #830
- [memory refactor][3/n] Introduce RAGToolRuntime as a specialized sub-protocol by @ashwinb in #832
- [memory refactor][4/n] Update the client-sdk test for RAG by @ashwinb in #834
- [memory refactor][5/n] Migrate all vector_io providers by @ashwinb in #835
- [memory refactor][6/n] Update naming and routes by @ashwinb in #839
- Fix fireworks client sdk chat completion with images by @hardikjshah in #840
- [inference api] modify content types so they follow a more standard structure by @ashwinb in #841
New Contributors
- @cdgamarose-nv made their first contribution in #661
- @eltociear made their first contribution in #675
- @derekslager made their first contribution in #692
- @VladOS95-cyber made their first contribution in #557
- @frreiss made their first contribution in #662
- @pmccarthy made their first contribution in #807
Full Changelog: v0.0.63...v0.1.0rc12