v0.1.0rc12
Pre-release
What's Changed
- [4/n][torchtune integration] support lazy load model during inference by @SLR722 in #620
- remove unused telemetry related code for console by @dineshyv in #659
- Fix Meta reference GPU implementation by @ashwinb in #663
- Fixed imports for inference by @cdgamarose-nv in #661
- fix trace starting in library client by @dineshyv in #655
- Add Llama 70B 3.3 to fireworks by @aidando73 in #654
- Tools API with brave and MCP providers by @dineshyv in #639
- [torchtune integration] post training + eval by @SLR722 in #670
- Fix post training apis broken by torchtune release by @SLR722 in #674
- Add missing venv option in --image-type by @terrytangyuan in #677
- Removed unnecessary CONDA_PREFIX env var in installation guide by @terrytangyuan in #683
- Add 3.3 70B to Ollama inference provider by @aidando73 in #681
- docs: update evals_reference/index.md by @eltociear in #675
- [remove import ][1/n] clean up import & in apis/ by @yanxi0830 in #689
- [bugfix] fix broken vision inference, change serialization for bytes by @yanxi0830 in #693
- Minor Quick Start documentation updates. by @derekslager in #692
- [bugfix] fix meta-reference agents w/ safety multiple model loading pytest by @yanxi0830 in #694
- [bugfix] fix prompt_adapter interleaved_content_convert_to_raw by @yanxi0830 in #696
- Add missing "inline::" prefix for providers in building_distro.md by @terrytangyuan in #702
- Fix failing flake8 E226 check by @terrytangyuan in #701
- Add missing newlines before printing the Dockerfile content by @terrytangyuan in #700
- Add JSON structured outputs to Ollama Provider by @aidando73 in #680
- [#407] Agents: Avoid calling tools that haven't been explicitly enabled by @aidando73 in #637
- Made changes to readme and pinning to llamastack v0.0.61 by @heyjustinai in #624
- [rag evals][1/n] refactor base scoring fn & data schema check by @yanxi0830 in #664
- [Post Training] Fix missing import by @SLR722 in #705
- Import from the right path by @SLR722 in #708
- [#432] Add Groq Provider - chat completions by @aidando73 in #609
- Change post training run.yaml inference config by @SLR722 in #710
- [Post training] make validation steps configurable by @SLR722 in #715
- Fix incorrect entrypoint for broken `llama stack run` by @terrytangyuan in #706
- Fix assert message and call to completion_request_to_prompt in remote:vllm by @terrytangyuan in #709
- Fix Groq invalid self.config reference by @aidando73 in #719
- support llama3.1 8B instruct in post training by @SLR722 in #698
- remove default logger handlers when using libcli with notebook by @dineshyv in #718
- move DataSchemaValidatorMixin into standalone utils by @yanxi0830 in #720
- add 3.3 to together inference provider by @yanxi0830 in #729
- Update CODEOWNERS - add sixianyi0721 as the owner by @sixianyi0721 in #731
- fix links for distro by @yanxi0830 in #733
- add --version to llama stack CLI & /version endpoint by @yanxi0830 in #732
- agents to use tools api by @dineshyv in #673
- Add X-LlamaStack-Client-Version, rename ProviderData -> Provider-Data by @ashwinb in #735
- Check version incompatibility by @ashwinb in #738
- Add persistence for localfs datasets by @VladOS95-cyber in #557
- Fixed typo in default VLLM_URL in remote-vllm.md by @terrytangyuan in #723
- Consolidating Memory tests under client-sdk by @vladimirivic in #703
- Expose LLAMASTACK_PORT in cli.stack.run by @terrytangyuan in #722
- remove conflicting default for tool prompt format in chat completion by @dineshyv in #742
- rename LLAMASTACK_PORT to LLAMA_STACK_PORT for consistency with other env vars by @raghotham in #744
- Add inline vLLM inference provider to regression tests and fix regressions by @frreiss in #662
- [CICD] github workflow to push nightly package to testpypi by @yanxi0830 in #734
- Replaced zrangebylex method in the range method by @cheesecake100201 in #521
- Improve model download doc by @SLR722 in #748
- Support building UBI9 base container image by @terrytangyuan in #676
- update notebook to use new tool defs by @dineshyv in #745
- Add provider data passing for library client by @dineshyv in #750
- [Fireworks] Update model name for Fireworks by @benjibc in #753
- Consolidating Inference tests under client-sdk tests by @vladimirivic in #751
- Consolidating Safety tests from various places under client-sdk by @vladimirivic in #699
- [CI/CD] more robust re-try for downloading testpypi package by @yanxi0830 in #749
- [#432] Add Groq Provider - tool calls by @aidando73 in #630
- Rename ipython to tool by @ashwinb in #756
- Fix incorrect Python binary path for UBI9 image by @terrytangyuan in #757
- Update Cerebras docs to include header by @henrytwo in #704
- Add init files to post training folders by @SLR722 in #711
- Switch to use importlib instead of deprecated pkg_resources by @terrytangyuan in #678
- [bugfix] fix streaming GeneratorExit exception with LlamaStackAsLibraryClient by @yanxi0830 in #760
- Fix telemetry to work on reinstantiating new lib cli by @dineshyv in #761
- [post training] define llama stack post training dataset format by @SLR722 in #717
- add braintrust to experimental-post-training template by @SLR722 in #763
- added support of PYPI_VERSION in stack build by @jeffxtang in #762
- Fix broken tests in test_registry by @vladimirivic in #707
- Fix fireworks run-with-safety template by @vladimirivic in #766
- Free up memory after post training finishes by @SLR722 in #770
- Fix issue when generating distros by @terrytangyuan in #755
- Convert `SamplingParams.strategy` to a union by @hardikjshah in #767
- [CICD] Github workflow for publishing Docker images by @yanxi0830 in #764
- [bugfix] fix llama guard parsing ContentDelta by @yanxi0830 in #772
- rebase eval test w/ tool_runtime fixtures by @yanxi0830 in #773
- More idiomatic REST API by @dineshyv in #765
- add nvidia distribution by @cdgamarose-nv in #565
- bug fixes on inference tests by @sixianyi0721 in #774
- [bugfix] fix inference sdk test for v1 by @yanxi0830 in #775
- fix routing in library client by @dineshyv in #776
- [bugfix] fix client-sdk tests for v1 by @yanxi0830 in #777
- fix nvidia inference provider by @yanxi0830 in #781
- Make notebook testable by @hardikjshah in #780
- Fix telemetry by @dineshyv in #787
- fireworks add completion logprobs adapter by @yanxi0830 in #778
- Idiomatic REST API: Inspect by @dineshyv in #779
- Idiomatic REST API: Evals by @dineshyv in #782
- Add notebook testing to nightly build job by @hardikjshah in #785
- [test automation] support run tests on config file by @sixianyi0721 in #730
- Idiomatic REST API: Telemetry by @dineshyv in #786
- Make llama stack build not create a new conda by default by @ashwinb in #788
- REST API fixes by @dineshyv in #789
- fix cerebras template by @yanxi0830 in #790
- [Test automation] generate custom test report by @sixianyi0721 in #739
- cerebras template update for memory by @yanxi0830 in #792
- Pin torchtune pkg version by @SLR722 in #791
- fix the code execution test in sdk tests by @dineshyv in #794
- add default toolgroups to all providers by @dineshyv in #795
- Fix tgi adapter by @yanxi0830 in #796
- Remove llama-guard in Cerebras template & improve agent test by @yanxi0830 in #798
- meta reference inference fixes by @ashwinb in #797
- fix provider model list test by @hardikjshah in #800
- fix playground for v1 by @yanxi0830 in #799
- fix eval notebook & add test to workflow by @yanxi0830 in #803
- add json_schema_type to ParamType deps by @dineshyv in #808
- Fixing small typo in quick start guide by @pmccarthy in #807
- cannot import name 'GreedySamplingStrategy' by @aidando73 in #806
- optional api dependencies by @ashwinb in #793
- fix vllm template by @yanxi0830 in #813
- More generic image type for OCI-compliant container technologies by @terrytangyuan in #802
- add mcp runtime as default to all providers by @dineshyv in #816
- fix vllm base64 image inference by @yanxi0830 in #815
- fix again vllm for non base64 by @yanxi0830 in #818
- Fix incorrect RunConfigSettings due to the removal of conda_env by @terrytangyuan in #801
- Fix incorrect image type in publish-to-docker workflow by @terrytangyuan in #819
- test report for v0.1 by @sixianyi0721 in #814
- [CICD] add simple test step for docker build workflow, fix prefix bug by @yanxi0830 in #821
- add section for mcp tool usage in notebook by @dineshyv in #831
- [ez] structured output for /completion ollama & enable tests by @sixianyi0721 in #822
- add pytest option to generate a functional report for distribution by @sixianyi0721 in #833
- bug fix for distro report generation by @sixianyi0721 in #836
- [memory refactor][1/n] Rename Memory -> VectorIO, MemoryBanks -> VectorDBs by @ashwinb in #828
- [memory refactor][2/n] Update faiss and make it pass tests by @ashwinb in #830
- [memory refactor][3/n] Introduce RAGToolRuntime as a specialized sub-protocol by @ashwinb in #832
- [memory refactor][4/n] Update the client-sdk test for RAG by @ashwinb in #834
- [memory refactor][5/n] Migrate all vector_io providers by @ashwinb in #835
- [memory refactor][6/n] Update naming and routes by @ashwinb in #839
- Fix fireworks client sdk chat completion with images by @hardikjshah in #840
- [inference api] modify content types so they follow a more standard structure by @ashwinb in #841
New Contributors
- @cdgamarose-nv made their first contribution in #661
- @eltociear made their first contribution in #675
- @derekslager made their first contribution in #692
- @VladOS95-cyber made their first contribution in #557
- @frreiss made their first contribution in #662
- @pmccarthy made their first contribution in #807
Full Changelog: v0.0.63...v0.1.0rc12