Releases · meta-llama/llama-stack
Llama Stack 0.0.54 Release
What's Changed
- Bugfix release on top of 0.0.53
- Don't depend on templates.py when printing llama stack build messages by @ashwinb in #496
- Restructure docs by @dineshyv in #494
- Since we are pushing for HF repos, we should accept them in inference configs by @ashwinb in #497
- Fix fp8 quantization script. by @liyunlu0618 in #500
- use logging instead of prints by @dineshyv in #499
New Contributors
- @liyunlu0618 made their first contribution in #500
Full Changelog: v0.0.53...v0.0.54
Llama Stack 0.0.53 Release
🚀 Initial Release Notes for Llama Stack!
Added
- Resource-oriented design for models, shields, memory banks, datasets and eval tasks
- Persistence for registered objects with distribution
- Ability to persist memory banks created for FAISS
- PostgreSQL KVStore implementation
- Environment variable placeholder support in run.yaml files (see the sketch after this list)
- Comprehensive Zero-to-Hero notebooks and quickstart guides
- Support for quantized models in Ollama
- Vision model support for Together, Fireworks, Meta-Reference, Ollama, and vLLM
- Bedrock distribution with safety shields support
- Evals API with task registration and scoring functions
- MMLU and SimpleQA benchmark scoring functions
- Hugging Face dataset provider integration for benchmarks
- Support for custom dataset registration from local paths
- Benchmark evaluation CLI tools with visualization tables
- RAG evaluation scoring functions and metrics
- Local persistence for datasets and eval tasks
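
A minimal sketch of the environment variable placeholder feature, assuming the `${env.VAR}` substitution syntax; the provider and key names below are illustrative, not quoted from a shipped template:

```yaml
# run.yaml (illustrative sketch, not a shipped template)
apis:
  - inference
providers:
  inference:
    - provider_id: ollama
      provider_type: remote::ollama   # hypothetical provider choice
      config:
        url: ${env.OLLAMA_URL}        # resolved from the environment at startup
```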
Changed
- Split safety into distinct providers (llama-guard, prompt-guard, code-scanner)
- Changed provider naming convention (`impls` → `inline`, `adapters` → `remote`); see the sketch after this list
- Updated API signatures for dataset and eval task registration
- Restructured folder organization for providers
- Enhanced Docker build configuration
- Added version prefixing for REST API routes
- Enhanced evaluation task registration workflow
- Improved benchmark evaluation output formatting
- Restructured evals folder organization for better modularity
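
A rough before/after sketch of the renamed provider convention; the old spellings are assumptions for illustration, not quoted from a real config:

```yaml
# Before (hypothetical old spellings using the impls/adapters vocabulary):
#   safety:
#     - provider_type: impls::llama-guard
#     - provider_type: adapters::bedrock
# After the rename (inline = in-process implementation, remote = adapter):
safety:
  - provider_type: inline::llama-guard
  - provider_type: remote::bedrock
```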
Removed
- `llama stack configure` command
What's Changed
- Update download command by @Wauplin in #9
- Update fbgemm version by @jianyuh in #12
- Add CLI reference docs by @dltn in #14
- Added Ollama as an inference impl by @hardikjshah in #20
- Hide older models by @dltn in #23
- Introduce Llama stack distributions by @ashwinb in #22
- Rename inline -> local by @dltn in #24
- Avoid using nearly double the memory needed by @ashwinb in #30
- Updates to prompt for tool calls by @hardikjshah in #29
- RFC-0001-The-Llama-Stack by @raghotham in #8
- Add API keys to AgenticSystemConfig instead of relying on dotenv by @ashwinb in #33
- update cli ref doc by @jeffxtang in #34
- fixed bug in download not enough disk space condition by @sisminnmaw in #35
- Updated CLI instructions with additional details for each subcommand by @varunfb in #36
- Updated URLs and addressed feedback by @varunfb in #37
- Fireworks basic integration by @benjibc in #39
- Together AI basic integration by @Nutlope in #43
- Update LICENSE by @raghotham in #47
- Add patch for SSE event endpoint responses by @dltn in #50
- API Updates: fleshing out RAG APIs, introduce "llama stack" CLI command by @ashwinb in #51
- [inference] Add a TGI adapter by @ashwinb in #52
- upgrade llama_models by @benjibc in #55
- Query generators for RAG query by @hardikjshah in #54
- Add Chroma and PGVector adapters by @ashwinb in #56
- API spec update, client demo with Stainless SDK by @yanxi0830 in #58
- Enable Bing search by @hardikjshah in #59
- add safety to openapi spec by @yanxi0830 in #62
- Add config file based CLI by @yanxi0830 in #60
- Simplified Telemetry API and tying it to logger by @ashwinb in #57
- [Inference] Use huggingface_hub inference client for TGI adapter by @hanouticelina in #53
- Support `data:` in URL for memory. Add ootb support for pdfs by @hardikjshah in #67
- Remove request wrapper migration by @yanxi0830 in #64
- CLI Update: build -> configure -> run by @yanxi0830 in #69
- API Updates by @ashwinb in #73
- Unwrap ChatCompletionRequest for context_retriever by @yanxi0830 in #75
- CLI - add back build wizard, configure with name instead of build.yaml by @yanxi0830 in #74
- CLI: add build templates support, move imports by @yanxi0830 in #77
- fix prompt with name args by @yanxi0830 in #80
- Fix memory URL parsing by @yanxi0830 in #81
- Allow TGI adaptor to have non-standard llama model names by @hardikjshah in #84
- [API Updates] Model / shield / memory-bank routing + agent persistence + support for private headers by @ashwinb in #92
- Bedrock Guardrails committing after rebasing the fork by @rsgrewal-aws in #96
- Bedrock Inference Integration by @poegej in #94
- Support for Llama3.2 models and Swift SDK by @ashwinb in #98
- fix safety using inference by @yanxi0830 in #99
- Fixes typo for setup instruction for starting Llama Stack Server section by @abhishekmishragithub in #103
- Make TGI adapter compatible with HF Inference API by @Wauplin in #97
- Fix links & format by @machina-source in #104
- docs: fix typo by @dijonkitchen in #107
- LG safety fix by @kplawiak in #108
- Minor typos, HuggingFace -> Hugging Face by @marklysze in #113
- Reordered pip install and llama model download by @KarthiDreamr in #112
- Update getting_started.ipynb by @delvingdeep in #117
- fix: 404 link to agentic system repository by @moldhouse in #118
- Fix broken links in RFC-0001-llama-stack.md by @bhimrazy in #134
- Validate `name` in `llama stack build` by @russellb in #128
- inference: Fix download command in error msg by @russellb in #133
- configure: Fix an error msg typo by @russellb in #131
- docs: Note how to use podman by @russellb in #130
- add env for LLAMA_STACK_CONFIG_DIR by @yanxi0830 in #137
- [bugfix] fix duplicate api endpoints by @yanxi0830 in #139
- Use inference APIs for executing Llama Guard by @ashwinb in #121
- fixing safety inference and safety adapter for new API spec. Pinned t… by @yogishbaliga in #105
- [CLI] remove dependency on CONDA_PREFIX in CLI by @yanxi0830 in #144
- [bugfix] fix #146 by @yanxi0830 in #147
- Extract provider data properly (attempt 2) by @ashwinb in #148
- `is_multimodal` accepts `core_model_id` not model itself. by @wizardbc in #153
- fix broken bedrock inference provider by @moritalous in #151
- Fix podman+selinux compatibility by @russellb in #132
- docker: Install in editable mode for dev purposes by @russellb in #160
- [CLI] simplify docker run by @yanxi0830 in #159
- Add a RoutableProvider protocol, support for multiple routing keys by @ashwinb in #163
- docker: Check for selinux before using `--security-opt` by @russellb in #167
- Adds markdown-link-check and fixes a broken link by @codefromthecrypt in #165
- [bugfix] conda path lookup by @yanxi0830 in #179
- fix prompt guard by @ashwinb in #177
- inference: Add model option to client by @russellb in #17...