Stars
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents
An OpenAI DeepResearch alternative: an AI-driven research system that performs comprehensive, iterative research on any topic using multiple search engines and LLMs.
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Enjoy the magic of Diffusion models!
Official Repo for "TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding"
Wan: Open and Advanced Large-Scale Video Generative Models
Master programming by recreating your favorite technologies from scratch.
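AigcPanel is a simple, easy-to-use, all-in-one AI digital human system. It supports video synthesis, voice synthesis, and voice cloning, and simplifies local model management with one-click import and use of AI models.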
Model Context Protocol Servers
The official Python SDK for Model Context Protocol servers and clients
Make websites accessible for AI agents
A tutorial on large language model application development aimed at beginner developers. Read online: https://datawhalechina.github.io/llm-universe/
🚀🚀 [LLM] Train a small 26M-parameter GPT entirely from scratch in just 2 hours! 🌏
Drag & drop UI to build your customized LLM flow
A simple screen-parsing tool towards a pure-vision-based GUI agent
SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers
SkyReels V1: The first and most advanced open-source human-centric video foundation model
🪄 Create rich visualizations with AI
An open-source, extensible AI agent that goes beyond code suggestions: install, execute, edit, and test with any LLM
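A chatbot built on large language models, with integrations for WeChat Official Accounts, WeCom apps, Feishu, DingTalk, and more. Selectable models include GPT3.5/GPT-4o/GPT-o1/DeepSeek/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI. It can process text, voice, and images, access the operating system and the internet, and supports building a customized enterprise customer-service assistant on your own knowledge base.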
A high-throughput and memory-efficient inference and serving engine for LLMs
Sky-T1: Train your own O1 preview model within $450
Janus-Series: Unified Multimodal Understanding and Generation Models
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge managemen…
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.