
- Shanghai, China
Starred repositories
LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
The ultimate LLM/AI application development framework in Golang.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
Kubernetes WithOut Kubelet - Simulates thousands of Nodes and Clusters.
Gateway API Inference Extension
PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents that fully preserves the layout; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/Docker/Zotero
Heterogeneous AI Computing Virtualization Middleware
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
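Llama Chinese community: the Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are aggregated in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use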
kubectl plugin to print Kubernetes resource conditions
JobSet: a k8s native API for distributed ML training and HPC workloads
A Cloud Native Batch System (Project under CNCF)
A toolkit to run Ray applications on Kubernetes
Comparison of Language Model Inference Engines
AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native.
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discr…
Open source platform for the machine learning lifecycle