dylanhogg/awesome-python

Awesome Python

License: MIT

Hand-picked awesome Python libraries and frameworks, organised by category 🐍

Interactive version: www.awesomepython.org

Updated 06 Feb 2025

Categories

  • Newly Created Repositories - Awesome Python is regularly updated, and this category lists the most recently created GitHub repositories from all the other repositories here (10 repos)
  • Agentic AI - Agentic AI libraries, frameworks and tools: AI agents, workflows, autonomous decision-making, goal-oriented tasks, and API integrations (51 repos)
  • Code Quality - Code quality tooling: linters, formatters, pre-commit hooks, unused code removal (16 repos)
  • Crypto and Blockchain - Cryptocurrency and blockchain libraries: trading bots, API integration, Ethereum virtual machine, solidity (14 repos)
  • Data - General data libraries: data processing, serialisation, formats, databases, SQL, connectors, web crawlers, data generation/augmentation/checks (115 repos)
  • Debugging - Debugging and tracing tools (10 repos)
  • Diffusion Text to Image - Text-to-image diffusion model libraries, tools and apps for generating images from natural language (42 repos)
  • Finance - Financial and quantitative libraries: investment research tools, market data, algorithmic trading, backtesting, financial derivatives (34 repos)
  • Game Development - Game development tools, engines and libraries (8 repos)
  • GIS - Geospatial libraries: raster and vector data formats, interactive mapping and visualisation, computing frameworks for processing images, projections (28 repos)
  • Graph - Graphs and network libraries: network analysis, graph machine learning, visualisation (6 repos)
  • GUI - Graphical user interface libraries and toolkits (8 repos)
  • Jupyter - Jupyter and JupyterLab and Notebook tools, libraries and plugins (27 repos)
  • LLMs and ChatGPT - Large language model and GPT libraries and frameworks: auto-gpt, agents, QnA, chain-of-thought workflows, API integrations. Also see the Natural Language Processing category for crossover (310 repos)
  • Math and Science - Mathematical, numerical and scientific libraries (30 repos)
  • Machine Learning - General - General and classical machine learning libraries. See below for other sections covering specialised ML areas (160 repos)
  • Machine Learning - Deep Learning - Machine learning libraries that cross over with deep learning in some way (79 repos)
  • Machine Learning - Interpretability - Machine learning interpretability libraries. Covers explainability, prediction explanations, dashboards, understanding knowledge development in training (22 repos)
  • Machine Learning - Ops - MLOps tools, frameworks and libraries: intersection of machine learning, data engineering and DevOps; deployment, health, diagnostics and governance of ML models (43 repos)
  • Machine Learning - Reinforcement - Machine learning libraries and toolkits that cross over with reinforcement learning in some way: agent reinforcement learning, agent environments, RLHF (23 repos)
  • Machine Learning - Time Series - Machine learning and classical timeseries libraries: forecasting, seasonality, anomaly detection, econometrics (19 repos)
  • Natural Language Processing - Natural language processing libraries and toolkits: text processing, topic modelling, tokenisers, chatbots. Also see the LLMs and ChatGPT category for crossover (87 repos)
  • Packaging - Python packaging, dependency management and bundling (28 repos)
  • Pandas - Pandas and dataframe libraries: data analysis, statistical reporting, pandas GUIs, pandas performance optimisations (24 repos)
  • Performance - Performance, parallelisation and low level libraries (28 repos)
  • Profiling - Memory and CPU/GPU profiling tools and libraries (11 repos)
  • Security - Security related libraries: vulnerability discovery, SQL injection, environment auditing (14 repos)
  • Simulation - Simulation libraries: robotics, economic, agent-based, traffic, physics, astronomy, chemistry, quantum simulation. Also see the Math and Science category for crossover (37 repos)
  • Study - Miscellaneous study resources: algorithms, general resources, system design, code repos for textbooks, best practices, tutorials (60 repos)
  • Template - Template tools and libraries: cookiecutter repos, generators, quick-starts (10 repos)
  • Terminal - Terminal and console tools and libraries: CLI tools, terminal based formatters, progress bars (15 repos)
  • Testing - Testing libraries: unit testing, load testing, acceptance testing, code coverage, browser automation, plugins (24 repos)
  • Typing - Typing libraries: static and run-time type checking, annotations (12 repos)
  • Utility - General utility libraries: miscellaneous tools, linters, code formatters, version management, package tools, documentation tools (210 repos)
  • Visualisation - Visualisation tools and libraries. Application frameworks, 2D/3D plotting, dashboards, WebGL (36 repos)
  • Web - Web related frameworks and libraries: webapp servers, WSGI, ASGI, asyncio, HTTP, REST, user management (58 repos)

Newly Created Repositories

Awesome Python is regularly updated, and this category lists the most recently created GitHub repositories from all the other repositories here.

  1. deepseek-ai/DeepSeek-V3 ⭐ 72,238
    A strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.

  2. huggingface/open-r1 ⭐ 15,441
    The goal of this repo is to build the missing pieces of the R1 pipeline such that everybody can reproduce and build on top of it

  3. jiayi-pan/TinyZero ⭐ 7,718
    TinyZero is a reproduction of DeepSeek R1 Zero in countdown and multiplication tasks.

  4. nvidia/Cosmos ⭐ 7,364
    NVIDIA Cosmos is a developer-first world foundation model platform designed to help Physical AI developers build their Physical AI systems better and faster.

  5. novasky-ai/SkyThought ⭐ 2,353
    Sky-T1: Train your own O1 preview model within $450
    🔗 novasky-ai.github.io

  6. langchain-ai/executive-ai-assistant ⭐ 1,487
    Executive AI Assistant (EAIA) is an AI agent that attempts to do the job of an Executive Assistant (EA).

  7. deep-agent/R1-V ⭐ 1,481
    We are building a general framework for Reinforcement Learning with Verifiable Rewards (RLVR) in VLM. RLVR outperforms chain-of-thought supervised fine-tuning (CoT-SFT) in both effectiveness and out-of-distribution (OOD) robustness for vision language models.

  8. thytu/Agentarium ⭐ 853
    Framework for managing and orchestrating AI agents with ease. Agentarium provides a flexible and intuitive way to create, manage, and coordinate interactions between multiple AI agents in various environments.

  9. developersdigest/llm-api-engine ⭐ 622
    Build and deploy AI-powered APIs in seconds. This project allows you to create custom APIs that extract structured data from websites using natural language descriptions, powered by LLMs and web scraping technology.
    🔗 www.youtube.com/watch?v=8kuek1bo4mm

  10. whitead/paper-qa ⭐ 3
    High accuracy RAG for answering questions from scientific documents with citations

Agentic AI

Agentic AI libraries, frameworks and tools: AI agents, workflows, autonomous decision-making, goal-oriented tasks, and API integrations.

  1. langchain-ai/langchain ⭐ 99,521
    🦜🔗 Build context-aware reasoning applications
    🔗 python.langchain.com

  2. langgenius/dify ⭐ 61,597
    Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
    🔗 dify.ai

  3. logspace-ai/langflow ⭐ 46,385
    Langflow is a low-code app builder for RAG and multi-agent AI applications. It's Python-based and agnostic to any model, API, or database.
    🔗 www.langflow.org

  4. microsoft/autogen ⭐ 38,740
    A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
    🔗 microsoft.github.io/autogen

  5. run-llama/llama_index ⭐ 38,553
    LlamaIndex is the leading framework for building LLM-powered agents over your data.
    🔗 docs.llamaindex.ai

  6. openbmb/ChatDev ⭐ 26,053
    ChatDev stands as a virtual software company that operates through various intelligent agents holding different roles, including Chief Executive Officer, Chief Product Officer etc
    🔗 arxiv.org/abs/2307.07924

  7. joaomdmoura/crewAI ⭐ 25,897
    Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
    🔗 crewai.com

  8. browser-use/browser-use ⭐ 23,992
    Browser use is the easiest way to connect your AI agents with the browser.
    🔗 browser-use.com

  9. stanford-oval/storm ⭐ 21,553
    An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
    🔗 storm.genie.stanford.edu

  10. yoheinakajima/babyagi ⭐ 20,920
    GPT-4 powered task-driven autonomous agent
    🔗 babyagi.org

  11. phidatahq/agno ⭐ 18,534
    Phidata is a toolkit for building AI Assistants using function calling.
    🔗 docs.agno.com

  12. openai/swarm ⭐ 18,450
    A framework exploring ergonomic, lightweight multi-agent orchestration.

  13. unity-technologies/ml-agents ⭐ 17,548
    The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
    🔗 unity.com/products/machine-learning-agents

  14. assafelovic/gpt-researcher ⭐ 16,326
    LLM based autonomous agent that conducts local and web research on any topic and generates a comprehensive report with citations.
    🔗 gptr.dev

  15. letta-ai/letta ⭐ 14,316
    Letta (formerly MemGPT) is a framework for creating LLM services with memory.
    🔗 docs.letta.com

  16. smol-ai/developer ⭐ 11,866
    the first library to let you embed a developer agent in your own app!
    🔗 twitter.com/smolmodels

  17. sakanaai/AI-Scientist ⭐ 8,832
    The AI Scientist, the first comprehensive system for fully automatic scientific discovery, enabling Foundation Models such as Large Language Models (LLMs) to perform research independently.

  18. langchain-ai/langgraph ⭐ 8,646
    LangGraph is a library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain.
    🔗 langchain-ai.github.io/langgraph

  19. meta-llama/llama-stack ⭐ 7,140
    Llama Stack standardizes the building blocks needed to bring genai applications to market. These blocks cover model training and fine-tuning, evaluation, and running AI agents in production

  20. huggingface/smolagents ⭐ 6,649
    🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.
    🔗 huggingface.co/docs/smolagents

  21. pydantic/pydantic-ai ⭐ 6,023
    PydanticAI is a Python Agent Framework designed to make it less painful to build production grade applications with Generative AI.
    🔗 ai.pydantic.dev

  22. nirdiamant/GenAI_Agents ⭐ 5,830
    Tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive AI systems.

  23. prefecthq/marvin ⭐ 5,452
    ✨ AI agents that spark joy
    🔗 askmarvin.ai

  24. mnotgod96/AppAgent ⭐ 5,432
    AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
    🔗 appagent-official.github.io

  25. kyegomez/swarms ⭐ 4,428
    The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Website: https://swarms.ai
    🔗 docs.swarms.world

  26. meta-llama/llama-stack-apps ⭐ 4,107
    Agentic components of the Llama Stack APIs

  27. crewaiinc/crewAI-examples ⭐ 3,569
    A collection of examples that show how to use CrewAI framework to automate workflows.

  28. langroid/langroid ⭐ 3,005
    Harness LLMs with Multi-Agent Programming
    🔗 langroid.github.io/langroid

  29. facebookresearch/Pearl ⭐ 2,758
    A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.

  30. brainblend-ai/atomic-agents ⭐ 2,433
    Atomic Agents provides a set of tools and agents that can be combined to create powerful applications. It is built on top of Instructor and leverages the power of Pydantic for data and schema validation and serialization.

  31. griptape-ai/griptape ⭐ 2,156
    Modular Python framework for AI agents and workflows with chain-of-thought reasoning, tools, and memory.
    🔗 www.griptape.ai

  32. joshuac215/agent-service-toolkit ⭐ 2,033
    A full toolkit for running an AI agent service built with LangGraph, FastAPI and Streamlit.
    🔗 agent-service-toolkit.streamlit.app

  33. run-llama/llama_deploy ⭐ 1,931
    Async-first framework for deploying, scaling, and productionizing agentic multi-service systems based on workflows from llama_index.
    🔗 docs.llamaindex.ai/en/stable/module_guides/llama_deploy

  34. landing-ai/vision-agent ⭐ 1,789
    VisionAgent is a library that helps you utilize agent frameworks to generate code to solve your vision task

  35. om-ai-lab/OmAgent ⭐ 1,537
    OmAgent is a Python library for building multimodal language agents with ease. We try to keep the library simple without too much overhead like other agent frameworks.
    🔗 om-agent.com

  36. langchain-ai/executive-ai-assistant ⭐ 1,487
    Executive AI Assistant (EAIA) is an AI agent that attempts to do the job of an Executive Assistant (EA).

  37. openautocoder/Agentless ⭐ 1,386
    Agentless🐱: an agentless approach to automatically solve software development problems

  38. pyspur-dev/pyspur ⭐ 1,369
    Minimalist Graph Editor for AI Agents
    🔗 pyspur.dev

  39. link-agi/AutoAgents ⭐ 1,270
    [IJCAI 2024] Generate different roles for GPTs to form a collaborative entity for complex tasks.
    🔗 huggingface.co/spaces/linksoul/autoagents

  40. emcie-co/parlant ⭐ 1,181
    The heavy-duty guidance framework for customer-facing LLM agents
    🔗 www.parlant.io

  41. shengranhu/ADAS ⭐ 1,157
    Automated Design of Agentic Systems using Meta Agent Search to show agents can invent novel and powerful agent designs
    🔗 www.shengranhu.com/adas

  42. prefecthq/ControlFlow ⭐ 1,118
    ControlFlow provides a structured, developer-focused framework for defining workflows and delegating work to LLMs, without sacrificing control or transparency
    🔗 controlflow.ai

  43. thytu/Agentarium ⭐ 853
    Framework for managing and orchestrating AI agents with ease. Agentarium provides a flexible and intuitive way to create, manage, and coordinate interactions between multiple AI agents in various environments.

  44. victordibia/autogen-ui ⭐ 838
    Web UI for AutoGen (A Framework for Multi-Agent LLM Applications)

  45. szczyglis-dev/py-gpt ⭐ 828
    Desktop AI Assistant powered by o1, o3-mini, GPT-4, GPT-4 Vision, Gemini, Claude, Llama 3, DeepSeek, Bielik, DALL-E, chat, vision, voice control, image generation and analysis, agents, command execution, file upload/download, speech synthesis and recognition, access to Web, memory, presets, assistants, plugins, and...
    🔗 pygpt.net

  46. google-deepmind/concordia ⭐ 759
    Concordia is a library to facilitate construction and use of generative agent-based models to simulate interactions of agents in grounded physical, social, or digital space.

  47. deedy/mac_computer_use ⭐ 734
    A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
    🔗 x.com/deedydas/status/1849481225041559910

  48. thudm/CogAgent ⭐ 658
    An open-sourced end-to-end VLM-based GUI Agent

  49. strnad/CrewAI-Studio ⭐ 572
    A user-friendly, multi-platform GUI for managing and running CrewAI agents and tasks. Supports Conda and virtual environments, no coding needed.

  50. salesforceairesearch/AgentLite ⭐ 568
    AgentLite is a research-oriented library designed for building and advancing LLM-based task-oriented agent systems. It simplifies the implementation of new agent/multi-agent architectures, enabling easy orchestration of multiple agents through a manager agent.

  51. plurai-ai/intellagent ⭐ 507
    Simulate interactions, analyze performance, and gain actionable insights for conversational agents. Test, evaluate, and optimize your agent to ensure reliable real-world deployment.
    🔗 intellagent-doc.plurai.ai

Code Quality

Code quality tooling: linters, formatters, pre-commit hooks, unused code removal.

  1. psf/black ⭐ 39,517
    The uncompromising Python code formatter
    🔗 black.readthedocs.io/en/stable

  2. astral-sh/ruff ⭐ 35,419
    An extremely fast Python linter and code formatter, written in Rust.
    🔗 docs.astral.sh/ruff

  3. google/yapf ⭐ 13,840
    A formatter for Python files

  4. pre-commit/pre-commit ⭐ 13,310
    A framework for managing and maintaining multi-language pre-commit hooks.
    🔗 pre-commit.com

  5. sqlfluff/sqlfluff ⭐ 8,534
    A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
    🔗 www.sqlfluff.com

  6. pycqa/isort ⭐ 6,593
    A Python utility / library to sort imports.
    🔗 pycqa.github.io/isort

  7. davidhalter/jedi ⭐ 5,864
    Awesome autocompletion, static analysis and refactoring library for python
    🔗 jedi.readthedocs.io

  8. pycqa/pylint ⭐ 5,385
    It's not just a linter that annoys you!
    🔗 pylint.readthedocs.io/en/latest

  9. asottile/pyupgrade ⭐ 3,677
    A tool (and pre-commit hook) to automatically upgrade syntax for newer versions of the language.

  10. jendrikseipp/vulture ⭐ 3,653
    Find dead Python code

  11. pycqa/flake8 ⭐ 3,525
    flake8 is a python tool that glues together pycodestyle, pyflakes, mccabe, and third-party plugins to check the style and quality of some python code.
    🔗 flake8.pycqa.org

  12. wemake-services/wemake-python-styleguide ⭐ 2,650
    The strictest and most opinionated python linter ever!
    🔗 wemake-python-styleguide.rtfd.io

  13. python-lsp/python-lsp-server ⭐ 2,050
    Fork of the python-language-server project, maintained by the Spyder IDE team and the community

  14. codespell-project/codespell ⭐ 2,002
    check code for common misspellings

  15. sourcery-ai/sourcery ⭐ 1,592
    Instant AI code reviews
    🔗 sourcery.ai

  16. tconbeer/sqlfmt ⭐ 428
    sqlfmt formats your dbt SQL files so you don't have to
    🔗 sqlfmt.com

Crypto and Blockchain

Cryptocurrency and blockchain libraries: trading bots, API integration, Ethereum virtual machine, solidity.

  1. freqtrade/freqtrade ⭐ 35,834
    Free, open source crypto trading bot
    🔗 www.freqtrade.io

  2. ccxt/ccxt ⭐ 34,373
    A JavaScript / TypeScript / Python / C# / PHP / Go cryptocurrency trading API with support for more than 100 bitcoin/altcoin exchanges
    🔗 docs.ccxt.com

  3. crytic/slither ⭐ 5,472
    Static Analyzer for Solidity and Vyper
    🔗 blog.trailofbits.com/2018/10/19/slither-a-solidity-static-analysis-framework

  4. ethereum/web3.py ⭐ 5,135
    A python interface for interacting with the Ethereum blockchain and ecosystem.
    🔗 web3py.readthedocs.io

  5. ethereum/consensus-specs ⭐ 3,626
    Ethereum Proof-of-Stake Consensus Specifications

  6. cyberpunkmetalhead/Binance-volatility-trading-bot ⭐ 3,438
    This is a fully functioning Binance trading bot that measures the volatility of every coin on Binance and places trades with the highest gaining coins. If you like this project, consider donating through the Brave browser to allow me to continuously improve the script.

  7. bmoscon/cryptofeed ⭐ 2,333
    Cryptocurrency Exchange Websocket Data Feed Handler

  8. ethereum/py-evm ⭐ 2,298
    A Python implementation of the Ethereum Virtual Machine
    🔗 py-evm.readthedocs.io/en/latest

  9. binance/binance-public-data ⭐ 1,701
    Details on how to get Binance public data

  10. ofek/bit ⭐ 1,269
    Bitcoin made easy.
    🔗 ofek.dev/bit

  11. man-c/pycoingecko ⭐ 1,063
    Python wrapper for the CoinGecko API

  12. palkeo/panoramix ⭐ 828
    Ethereum decompiler

  13. coinbase/agentkit ⭐ 451
    AgentKit is Coinbase Developer Platform's framework for easily enabling AI agents to take actions onchain. It is designed to be framework-agnostic, so you can use it with any AI framework, and wallet-agnostic

  14. dylanhogg/awesome-crypto ⭐ 73
    A list of awesome crypto and blockchain projects
    🔗 www.awesomecrypto.xyz

Data

General data libraries: data processing, serialisation, formats, databases, SQL, connectors, web crawlers, data generation/augmentation/checks.

  1. scrapy/scrapy ⭐ 54,011
    Scrapy, a fast high-level web crawling & scraping framework for Python.
    🔗 scrapy.org

  2. apache/spark ⭐ 40,453
    Apache Spark - A unified analytics engine for large-scale data processing
    🔗 spark.apache.org

  3. microsoft/markitdown ⭐ 36,143
    A utility for converting files to Markdown, supports: PDF, PPT, Word, Excel, Images etc

  4. mindsdb/mindsdb ⭐ 27,140
    AGI's query engine - Platform for building AI that can learn and answer questions over federated data.
    🔗 mindsdb.com

  5. getredash/redash ⭐ 26,834
    Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
    🔗 redash.io

  6. jaidedai/EasyOCR ⭐ 25,408
    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
    🔗 www.jaided.ai

  7. qdrant/qdrant ⭐ 21,607
    Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
    🔗 qdrant.tech

  8. humansignal/label-studio ⭐ 20,567
    Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats.
    🔗 labelstud.io

  9. ds4sd/docling ⭐ 19,796
    Docling parses documents and exports them to the desired format with ease and speed.
    🔗 ds4sd.github.io/docling

  10. joke2k/faker ⭐ 17,993
    Faker is a Python package that generates fake data for you.
    🔗 faker.readthedocs.io

  11. avaiga/taipy ⭐ 17,758
    Turns Data and AI algorithms into production-ready web applications in no time.
    🔗 www.taipy.io

  12. chroma-core/chroma ⭐ 17,386
    the AI-native open-source embedding database
    🔗 www.trychroma.com

  13. airbytehq/airbyte ⭐ 17,118
    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
    🔗 airbyte.com

  14. binux/pyspider ⭐ 16,540
    A Powerful Spider(Web Crawler) System in Python.
    🔗 docs.pyspider.org

  15. twintproject/twint ⭐ 15,930
    An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.

  16. tiangolo/sqlmodel ⭐ 15,125
    SQL databases in Python, designed for simplicity, compatibility, and robustness.
    🔗 sqlmodel.tiangolo.com

  17. apache/arrow ⭐ 14,949
    Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
    🔗 arrow.apache.org

  18. pathwaycom/pathway ⭐ 13,493
    Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
    🔗 pathway.com

  19. redis/redis-py ⭐ 12,828
    Redis Python client

  20. weaviate/weaviate ⭐ 12,208
    Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
    🔗 weaviate.io/developers/weaviate

  21. coleifer/peewee ⭐ 11,344
    a small, expressive orm -- supports postgresql, mysql, sqlite and cockroachdb
    🔗 docs.peewee-orm.com

  22. s0md3v/Photon ⭐ 11,291
    Incredibly fast crawler designed for OSINT.

  23. sqlalchemy/sqlalchemy ⭐ 9,977
    The Database Toolkit for Python
    🔗 www.sqlalchemy.org

  24. simonw/datasette ⭐ 9,776
    An open source multi-tool for exploring and publishing data
    🔗 datasette.io

  25. bigscience-workshop/petals ⭐ 9,395
    🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
    🔗 petals.dev

  26. voxel51/fiftyone ⭐ 9,132
    Refine high-quality datasets and visual AI models
    🔗 fiftyone.ai

  27. yzhao062/pyod ⭐ 8,781
    A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
    🔗 pyod.readthedocs.io

  28. gristlabs/grist-core ⭐ 7,762
    Grist is the evolution of spreadsheets.
    🔗 www.getgrist.com

  29. tobymao/sqlglot ⭐ 7,070
    Python SQL Parser and Transpiler
    🔗 sqlglot.com

  30. alirezamika/autoscraper ⭐ 6,612
    A Smart, Automatic, Fast and Lightweight Web Scraper for Python

  31. kaggle/kaggle-api ⭐ 6,411
    Official Kaggle API

  32. madmaze/pytesseract ⭐ 5,996
    A Python wrapper for Google Tesseract

  33. vi3k6i5/flashtext ⭐ 5,612
    Extract Keywords from sentence or Replace keywords in sentences.

  34. airbnb/knowledge-repo ⭐ 5,500
    A next-generation curated knowledge sharing platform for data scientists and other technical professions.

  35. ibis-project/ibis ⭐ 5,490
    Ibis is a Python library that provides a lightweight, universal interface for data wrangling. It helps Python users explore and transform data of any size, stored anywhere.
    🔗 ibis-project.org

  36. lancedb/lancedb ⭐ 5,409
    Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
    🔗 lancedb.github.io/lancedb

  37. cyclotruc/gitingest ⭐ 5,408
    Turn any Git repository into a prompt-friendly text ingest for LLMs.
    🔗 gitingest.com

  38. facebookresearch/AugLy ⭐ 4,988
    A data augmentations library for audio, image, text, and video.
    🔗 ai.facebook.com/blog/augly-a-new-data-augmentation-library-to-help-build-more-robust-ai-models

  39. superduperdb/superduper ⭐ 4,941
    Superduper: Build end-to-end AI applications and agent workflows on your existing data infrastructure and preferred tools - without migrating your data.
    🔗 superduper.io

  40. jazzband/tablib ⭐ 4,654
    Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
    🔗 tablib.readthedocs.io

  41. amundsen-io/amundsen ⭐ 4,487
    Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
    🔗 www.amundsen.io/amundsen

  42. lk-geimfari/mimesis ⭐ 4,484
    Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
    🔗 mimesis.name

  43. giskard-ai/giskard ⭐ 4,266
    🐢 Open-Source Evaluation & Testing for AI & LLM systems
    🔗 docs.giskard.ai

  44. mongodb/mongo-python-driver ⭐ 4,178
    PyMongo - the Official MongoDB Python driver
    🔗 www.mongodb.com/docs/languages/python/pymongo-driver/current

  45. adbar/trafilatura ⭐ 3,893
    Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
    🔗 trafilatura.readthedocs.io

  46. rom1504/img2dataset ⭐ 3,874
    Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

  47. andialbrecht/sqlparse ⭐ 3,809
    A non-validating SQL parser module for Python

  48. jmcnamara/XlsxWriter ⭐ 3,699
    A Python module for creating Excel XLSX files.
    🔗 xlsxwriter.readthedocs.io

  49. deepchecks/deepchecks ⭐ 3,698
    Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
    🔗 docs.deepchecks.com/stable

  50. praw-dev/praw ⭐ 3,574
    PRAW, an acronym for "Python Reddit API Wrapper", is a python package that allows for simple access to Reddit's API.
    🔗 praw.readthedocs.io

  51. run-llama/llama-hub ⭐ 3,465
    A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain
    🔗 llamahub.ai

  52. rapidai/RapidOCR ⭐ 3,435
    📄 Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO and PaddlePaddle.
    🔗 rapidai.github.io/rapidocrdocs

  53. pyeve/cerberus ⭐ 3,189
    Lightweight, extensible data validation library for Python
    🔗 python-cerberus.org

  54. zoomeranalytics/xlwings ⭐ 3,063
    xlwings is a Python library that makes it easy to call Python from Excel and vice versa. It works with Excel on Windows and macOS as well as with Google Sheets and Excel on the web.
    🔗 www.xlwings.org

  55. sqlalchemy/alembic ⭐ 3,056
    A database migrations tool for SQLAlchemy.

  56. dlt-hub/dlt ⭐ 3,055
    data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
    🔗 dlthub.com/docs

  57. docarray/docarray ⭐ 3,007
    Represent, send, store and search multimodal data
    🔗 docs.docarray.org

  58. pallets/itsdangerous ⭐ 2,966
    Safely pass trusted data to untrusted environments and back.
    🔗 itsdangerous.palletsprojects.com

  59. datafold/data-diff ⭐ 2,959
    Compare tables within or across databases
    🔗 docs.datafold.com

  60. goldsmith/Wikipedia ⭐ 2,916
    A Pythonic wrapper for the Wikipedia API
    🔗 wikipedia.readthedocs.org

  61. awslabs/amazon-redshift-utils ⭐ 2,787
    Amazon Redshift Utils contains utilities, scripts and views which are useful in a Redshift environment

  62. mlabonne/llm-datasets โญ 2,615
    Curated list of datasets and tools for post-training.
    ๐Ÿ”— mlabonne.github.io/blog

  63. kayak/pypika โญ 2,613
    PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
    ๐Ÿ”— pypika.readthedocs.io/en/latest

  64. sdv-dev/SDV โญ 2,479
    Synthetic data generation for tabular data
    ๐Ÿ”— docs.sdv.dev/sdv

  65. pynamodb/PynamoDB โญ 2,479
    A pythonic interface to Amazon's DynamoDB
    ๐Ÿ”— pynamodb.readthedocs.io

  66. uqfoundation/dill โญ 2,310
    serialize all of Python
    ๐Ÿ”— dill.rtfd.io

  67. emirozer/fake2db โญ 2,288
    Generate fake but valid data filled databases for test purposes using most popular patterns(AFAIK). Current support is sqlite, mysql, postgresql, mongodb, redis, couchdb.

  68. samuelcolvin/arq โญ 2,281
    Fast job queuing and RPC in python with asyncio and redis.
    ๐Ÿ”— arq-docs.helpmanual.io

  69. pikepdf/pikepdf โญ 2,252
    A Python library for reading and writing PDF, powered by QPDF
    ๐Ÿ”— pikepdf.readthedocs.io

  70. graphistry/pygraphistry โญ 2,203
    PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer

  71. accenture/AmpliGraph โญ 2,183
    Python library for Representation Learning on Knowledge Graphs https://docs.ampligraph.org

  72. huggingface/datatrove โญ 2,173
    DataTrove is a library to process, filter and deduplicate text data at a very large scale. It provides a set of prebuilt commonly used processing blocks with a framework to easily add custom functionality

  73. sfu-db/connector-x โญ 2,095
    Fastest library to load data from DB to DataFrames in Rust and Python
    ๐Ÿ”— sfu-db.github.io/connector-x

  74. aminalaee/sqladmin โญ 2,025
    SQLAlchemy Admin for FastAPI and Starlette
    ๐Ÿ”— aminalaee.dev/sqladmin

  75. milvus-io/bootcamp โญ 1,988
    Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
    ๐Ÿ”— milvus.io

  76. agronholm/sqlacodegen โญ 1,977
    Automatic model code generator for SQLAlchemy

  77. uber/petastorm โญ 1,812
    Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.

  78. aio-libs/aiomysql โญ 1,780
    aiomysql is a library for accessing a MySQL database from asyncio
    ๐Ÿ”— aiomysql.rtfd.io

  79. simonw/sqlite-utils โญ 1,748
    Python CLI utility and library for manipulating SQLite databases
    ๐Ÿ”— sqlite-utils.datasette.io

  80. simple-salesforce/simple-salesforce โญ 1,734
    A very simple Salesforce.com REST API client for Python

  81. collerek/ormar โญ 1,699
    python async orm with fastapi in mind and pydantic validation
    ๐Ÿ”— collerek.github.io/ormar

  82. zarr-developers/zarr-python โญ 1,590
    An implementation of chunked, compressed, N-dimensional arrays for Python.
    ๐Ÿ”— zarr.readthedocs.io

  83. eleutherai/the-pile โญ 1,529
    The Pile is a large, diverse, open source language modelling data set that consists of many smaller datasets combined together.

  84. scholarly-python-package/scholarly โญ 1,498
    Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
    ๐Ÿ”— scholarly.readthedocs.io

  85. ydataai/ydata-synthetic โญ 1,483
    Synthetic data generators for tabular and time-series data
    ๐Ÿ”— docs.synthetic.ydata.ai

  86. sdispater/orator โญ 1,422
    The Orator ORM provides a simple yet beautiful ActiveRecord implementation.
    ๐Ÿ”— orator-orm.com

  87. mchong6/JoJoGAN โญ 1,419
    Official PyTorch repo for JoJoGAN: One Shot Face Stylization

  88. google/tensorstore โญ 1,376
    Library for reading and writing large multi-dimensional arrays.
    ๐Ÿ”— google.github.io/tensorstore

  89. quixio/quix-streams โญ 1,289
    Python stream processing for Kafka
    ๐Ÿ”— docs.quix.io

  90. aio-libs/aiocache โญ 1,208
    Asyncio cache manager for redis, memcached and memory
    ๐Ÿ”— aiocache.readthedocs.io

  91. eliasdabbas/advertools โญ 1,181
    advertools - online marketing productivity and analysis tools
    ๐Ÿ”— advertools.readthedocs.io

  92. pytorch/data โญ 1,163
    A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.

  93. d-star-ai/dsRAG โญ 1,139
    A retrieval engine for unstructured data. It is especially good at handling challenging queries over dense text, like financial reports, legal documents, and academic papers.

  94. brettkromkamp/contextualise โญ 1,068
    Contextualise is an effective tool particularly suited for organising information-heavy projects and activities consisting of unstructured and widely diverse data and information resources
    ๐Ÿ”— contextualise.dev

  95. uber/fiber โญ 1,041
    Distributed Computing for AI Made Simple
    ๐Ÿ”— uber.github.io/fiber

  96. intake/intake โญ 1,025
    Intake is a lightweight package for finding, investigating, loading and disseminating data.
    ๐Ÿ”— intake.readthedocs.io

  97. duckdb/dbt-duckdb โญ 976
    dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)

  98. igorbenav/fastcrud โญ 931
    FastCRUD is a Python package for FastAPI, offering robust async CRUD operations and flexible endpoint creation utilities.

  99. goccy/bigquery-emulator โญ 890
    BigQuery emulator provides a way to launch a BigQuery server on your local machine for testing and development.

  100. scikit-hep/awkward โญ 861
    Manipulate JSON-like data with NumPy-like idioms.
    ๐Ÿ”— awkward-array.org

  101. macbre/sql-metadata โญ 834
    Uses tokenized query returned by python-sqlparse and generates query metadata
    ๐Ÿ”— pypi.python.org/pypi/sql-metadata

  102. koaning/human-learn โญ 801
    Natural Intelligence is still a pretty good idea.
    ๐Ÿ”— koaning.github.io/human-learn

  103. googleapis/python-bigquery โญ 749
    Python Client for Google BigQuery

  104. hyperqueryhq/whale โญ 725
    🐳 The stupidly simple CLI workspace for your data warehouse.
    ๐Ÿ”— rsyi.gitbook.io/whale

  105. dgarnitz/vectorflow โญ 682
    VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.
    ๐Ÿ”— www.getvectorflow.com

  106. kagisearch/vectordb โญ 673
    A minimal Python package for storing and retrieving text using chunking, embeddings, and vector search.
    ๐Ÿ”— vectordb.com

  107. weaviate/recipes โญ 660
    This repository shares end-to-end notebooks on how to use various Weaviate features and integrations!

  108. unstructured-io/unstructured-api โญ 627
    API for Open-Source Pre-Processing Tools for Unstructured Data

  109. apache/iceberg-python โญ 586
    PyIceberg is a Python library for programmatic access to Iceberg table metadata as well as to table data in Iceberg format.
    ๐Ÿ”— py.iceberg.apache.org

  110. jina-ai/vectordb โญ 586
    A Python vector database you just need - no more, no less.

  111. koaning/bulk โญ 563
    Bulk is a quick UI developer tool to apply some bulk labels.

  112. koaning/doubtlab โญ 508
    Doubt your data, find bad labels.
    ๐Ÿ”— koaning.github.io/doubtlab

  113. ibm/data-prep-kit โญ 462
    Data Prep Kit is a community project to democratize and accelerate unstructured data preparation for LLM app developers
    ๐Ÿ”— ibm.github.io/data-prep-kit

  114. titan-systems/titan โญ 450
    Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API.

  115. stackloklabs/promptwright โญ 371
    Promptwright is a Python library designed for generating large synthetic datasets using LLMs

Debugging

Debugging and tracing tools.

  1. cool-rr/PySnooper โญ 16,422
    Never use print for debugging again

  2. gruns/icecream โญ 9,477
    🍦 Never use print() to debug again.

  3. shobrook/rebound โญ 4,123
    Get Stack Overflow results in your terminal whenever an error is thrown

  4. inducer/pudb โญ 3,030
    Full-screen console debugger for Python
    ๐Ÿ”— documen.tician.de/pudb

  5. gotcha/ipdb โญ 1,882
    Integration of IPython pdb

  6. alexmojaki/heartrate โญ 1,806
    Simple real time visualisation of the execution of a Python program.

  7. alexmojaki/birdseye โญ 1,667
    Graphical Python debugger which lets you easily view the values of all evaluated expressions
    ๐Ÿ”— birdseye.readthedocs.io

  8. pdbpp/pdbpp โญ 1,328
    pdb++, a drop-in replacement for pdb (the Python debugger)

  9. alexmojaki/snoop โญ 1,320
    A powerful set of Python debugging tools, based on PySnooper

  10. samuelcolvin/python-devtools โญ 1,004
    Dev tools for python
    ๐Ÿ”— python-devtools.helpmanual.io

Diffusion Text to Image

Text-to-image diffusion model libraries, tools and apps for generating images from natural language.

  1. automatic1111/stable-diffusion-webui โญ 146,849
    Stable Diffusion web UI

  2. compvis/stable-diffusion โญ 69,375
    A latent text-to-image diffusion model
    ๐Ÿ”— ommer-lab.com/research/latent-diffusion-models

  3. comfyanonymous/ComfyUI โญ 65,770
    The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
    ๐Ÿ”— www.comfy.org

  4. stability-ai/stablediffusion โญ 39,940
    High-Resolution Image Synthesis with Latent Diffusion Models

  5. lllyasviel/ControlNet โญ 31,345
    Let us control diffusion models!

  6. huggingface/diffusers โญ 27,374
    🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
    ๐Ÿ”— huggingface.co/docs/diffusers

  7. invoke-ai/InvokeAI โญ 24,328
    Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, and serves as the foundation for multiple commercial products.
    ๐Ÿ”— invoke-ai.github.io/invokeai

  8. openbmb/MiniCPM-o โญ 18,085
    MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

  9. apple/ml-stable-diffusion โญ 17,113
    Stable Diffusion with Core ML on Apple Silicon

  10. borisdayma/dalle-mini โญ 14,784
    DALL·E Mini - Generate images from a text prompt
    ๐Ÿ”— www.craiyon.com

  11. divamgupta/diffusionbee-stable-diffusion-ui โญ 12,959
    Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
    ๐Ÿ”— diffusionbee.com

  12. compvis/latent-diffusion โญ 12,265
    High-Resolution Image Synthesis with Latent Diffusion Models

  13. instantid/InstantID โญ 11,360
    InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
    ๐Ÿ”— instantid.github.io

  14. lucidrains/DALLE2-pytorch โญ 11,207
    Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch

  15. facebookresearch/dinov2 โญ 9,734
    PyTorch code and models for the DINOv2 self-supervised learning method.

  16. ashawkey/stable-dreamfusion โญ 8,446
    Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.

  17. carson-katri/dream-textures โญ 7,919
    Stable Diffusion built-in to Blender

  18. xavierxiao/Dreambooth-Stable-Diffusion โญ 7,652
    Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion

  19. idea-research/GroundingDINO โญ 7,297
    [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
    ๐Ÿ”— arxiv.org/abs/2303.05499

  20. opengvlab/InternVL โญ 6,925
    [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.
    ๐Ÿ”— internvl.readthedocs.io/en/latest

  21. timothybrooks/instruct-pix2pix โญ 6,491
    PyTorch implementation of InstructPix2Pix, an instruction-based image editing model, based on the original CompVis/stable_diffusion repo.

  22. openai/consistency_models โญ 6,236
    Official repo for consistency models.

  23. salesforce/BLIP โญ 4,989
    PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

  24. nateraw/stable-diffusion-videos โญ 4,496
    Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

  25. lkwq007/stablediffusion-infinity โญ 3,865
    Outpainting with Stable Diffusion on an infinite canvas

  26. jina-ai/discoart โญ 3,846
    🪩 Create Disco Diffusion artworks in one line

  27. mlc-ai/web-stable-diffusion โญ 3,632
    Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.
    ๐Ÿ”— mlc.ai/web-stable-diffusion

  28. openai/glide-text2im โญ 3,577
    GLIDE: a diffusion-based text-conditional image synthesis model

  29. openai/improved-diffusion โญ 3,409
    Release for Improved Denoising Diffusion Probabilistic Models

  30. saharmor/dalle-playground โญ 2,765
    A playground to generate images from any text prompt using Stable Diffusion (past: using DALL-E Mini)

  31. google-research/big_vision โญ 2,547
    Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

  32. stability-ai/stability-sdk โญ 2,429
    SDK for interacting with stability.ai APIs (e.g. stable diffusion inference)
    ๐Ÿ”— platform.stability.ai

  33. thudm/CogVLM2 โญ 2,234
    GPT4V-level open-source multi-modal model based on Llama3-8B

  34. open-compass/VLMEvalKit โญ 1,767
    Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
    ๐Ÿ”— huggingface.co/spaces/opencompass/open_vlm_leaderboard

  35. coyote-a/ultimate-upscale-for-automatic1111 โญ 1,691
    Ultimate SD Upscale extension for AUTOMATIC1111 Stable Diffusion web UI

  36. divamgupta/stable-diffusion-tensorflow โญ 1,593
    Stable Diffusion in TensorFlow / Keras

  37. nvlabs/prismer โญ 1,307
    The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
    ๐Ÿ”— shikun.io/projects/prismer

  38. chenyangqiqi/FateZero โญ 1,130
    [ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
    ๐Ÿ”— fate-zero-edit.github.io

  39. thereforegames/unprompted โญ 793
    Templating language written for Stable Diffusion workflows. Available as an extension for the Automatic1111 WebUI.

  40. tanelp/tiny-diffusion โญ 702
    A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets.

  41. sharonzhou/long_stable_diffusion โญ 684
    Long-form text-to-images generation, using a pipeline of deep generative models (GPT-3 and Stable Diffusion)

  42. laion-ai/dalle2-laion โญ 501
    Pretrained Dalle2 from laion

Finance

Financial and quantitative libraries: investment research tools, market data, algorithmic trading, backtesting, financial derivatives.

  1. openbb-finance/OpenBB โญ 35,969
    Investment Research for Everyone, Everywhere.
    ๐Ÿ”— openbb.co

  2. quantopian/zipline โญ 17,992
    Zipline, a Pythonic Algorithmic Trading Library
    ๐Ÿ”— www.zipline.io

  3. microsoft/qlib โญ 16,303
    Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, ...
    ๐Ÿ”— qlib.readthedocs.io/en/latest

  4. mementum/backtrader โญ 15,681
    Python Backtesting library for trading strategies
    ๐Ÿ”— www.backtrader.com

  5. ranaroussi/yfinance โญ 15,596
    Download market data from Yahoo! Finance's API
    ๐Ÿ”— ranaroussi.github.io/yfinance

  6. ai4finance-foundation/FinGPT โญ 14,815
    FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
    ๐Ÿ”— ai4finance.org

  7. ai4finance-foundation/FinRL โญ 10,561
    FinRL: Financial Reinforcement Learning. 🔥
    ๐Ÿ”— ai4finance.org

  8. quantconnect/Lean โญ 10,422
    Lean Algorithmic Trading Engine by QuantConnect (Python, C#)
    ๐Ÿ”— lean.io

  9. ta-lib/ta-lib-python โญ 10,124
    Python wrapper for TA-Lib (http://ta-lib.org/).
    ๐Ÿ”— ta-lib.github.io/ta-lib-python

  10. goldmansachs/gs-quant โญ 8,304
    Python toolkit for quantitative finance
    ๐Ÿ”— developer.gs.com/discover/products/gs-quant

  11. virattt/ai-hedge-fund โญ 7,455
    AI-powered hedge fund. The goal of this project is to explore the use of AI to make trading decisions.

  12. kernc/backtesting.py โญ 5,877
    🔎 📈 🐍 💰 Backtest trading strategies in Python.
    ๐Ÿ”— kernc.github.io/backtesting.py

  13. quantopian/pyfolio โญ 5,806
    Portfolio and risk analytics in Python
    ๐Ÿ”— quantopian.github.io/pyfolio

  14. twopirllc/pandas-ta โญ 5,728
    Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 150+ Indicators
    ๐Ÿ”— twopirllc.github.io/pandas-ta

  15. ranaroussi/quantstats โญ 5,254
    Portfolio analytics for quants, written in Python

  16. polakowo/vectorbt โญ 4,713
    Find your trading edge, using the fastest engine for backtesting, algorithmic trading, and research.
    ๐Ÿ”— vectorbt.dev

  17. google/tf-quant-finance โญ 4,686
    High-performance TensorFlow library for quantitative finance.

  18. gbeced/pyalgotrade โญ 4,471
    Python Algorithmic Trading Library
    ๐Ÿ”— gbeced.github.io/pyalgotrade

  19. borisbanushev/stockpredictionai โญ 4,415
    In this notebook I will create a complete process for predicting stock price movements. Follow along and we will achieve some pretty good results. For that purpose we will use a Generative Adversarial Network (GAN) with LSTM, a type of Recurrent Neural Network, as generator, and a Convolutional Neural Networ...

  20. matplotlib/mplfinance โญ 3,831
    Financial Markets Data Visualization using Matplotlib
    ๐Ÿ”— pypi.org/project/mplfinance

  21. quantopian/alphalens โญ 3,516
    Performance analysis of predictive (alpha) stock factors
    ๐Ÿ”— quantopian.github.io/alphalens

  22. cuemacro/finmarketpy โญ 3,514
    Python library for backtesting trading strategies & analyzing financial markets (formerly pythalesians)
    ๐Ÿ”— www.cuemacro.com

  23. zvtvz/zvt โญ 3,369
    modular quant framework.
    ๐Ÿ”— zvt.readthedocs.io/en/latest

  24. robcarver17/pysystemtrade โญ 2,754
    Systematic Trading in python

  25. quantopian/research_public โญ 2,498
    Quantitative research and educational materials
    ๐Ÿ”— www.quantopian.com/lectures

  26. pmorissette/bt โญ 2,378
    bt - flexible backtesting for Python
    ๐Ÿ”— pmorissette.github.io/bt

  27. domokane/FinancePy โญ 2,249
    A Python Finance Library that focuses on the pricing and risk-management of Financial Derivatives, including fixed-income, equity, FX and credit derivatives.

  28. blankly-finance/blankly โญ 2,220
    🚀 💸 Easily build, backtest and deploy your algo in just a few lines of code. Trade stocks, cryptos, and forex across exchanges w/ one package.
    ๐Ÿ”— package.blankly.finance

  29. pmorissette/ffn โญ 2,109
    ffn - a financial function library for Python
    ๐Ÿ”— pmorissette.github.io/ffn

  30. cuemacro/findatapy โญ 1,754
    Python library to download market data via Bloomberg, Eikon, Quandl, Yahoo etc.

  31. quantopian/empyrical โญ 1,329
    Common financial risk and performance metrics. Used by zipline and pyfolio.
    ๐Ÿ”— quantopian.github.io/empyrical

  32. idanya/algo-trader โญ 810
    Trading bot with support for realtime trading, backtesting, custom strategies and much more.

  33. gbeced/basana โญ 647
    A Python async and event driven framework for algorithmic trading, with a focus on crypto currencies.

  34. chancefocus/PIXIU โญ 614
    This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning data, and evaluation benchmarks to holistically assess financial LLMs. Our goal is to continually push forward the open-source development of financial artificial intelligence (AI).

Game Development

Game development tools, engines and libraries.

  1. kitao/pyxel โญ 15,751
    A retro game engine for Python

  2. pygame/pygame โญ 7,722
    🐍🎮 pygame (the library) is a Free and Open Source python programming language library for making multimedia applications like games built on top of the excellent SDL library. C, Python, Native, OpenGL.
    ๐Ÿ”— www.pygame.org

  3. microsoft/TRELLIS โญ 7,413
    A large 3D asset generation model. It takes in text or image prompts and generates high-quality 3D assets in various formats, such as Radiance Fields, 3D Gaussians, and meshes.
    ๐Ÿ”— trellis3d.github.io

  4. panda3d/panda3d โญ 4,634
    Powerful, mature open-source cross-platform game engine for Python and C++, developed by Disney and CMU
    ๐Ÿ”— www.panda3d.org

  5. niklasf/python-chess โญ 2,504
    python-chess is a chess library for Python, with move generation, move validation, and support for common formats
    ๐Ÿ”— python-chess.readthedocs.io/en/latest

  6. pokepetter/ursina โญ 2,277
    A game engine powered by python and panda3d.
    ๐Ÿ”— pokepetter.github.io/ursina

  7. pyglet/pyglet โญ 1,955
    pyglet is a cross-platform windowing and multimedia library for Python, for developing games and other visually rich applications.
    ๐Ÿ”— pyglet.org

  8. pythonarcade/arcade โญ 1,747
    Easy to use Python library for creating 2D arcade games.
    ๐Ÿ”— arcade.academy

GIS

Geospatial libraries: raster and vector data formats, interactive mapping and visualisation, computing frameworks for processing images, projections.

  1. domlysz/BlenderGIS โญ 7,984
    Blender addons to make the bridge between Blender and geographic data

  2. python-visualization/folium โญ 7,025
    Python Data. Leaflet.js Maps.
    ๐Ÿ”— python-visualization.github.io/folium

  3. osgeo/gdal โญ 5,075
    GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
    ๐Ÿ”— gdal.org

  4. gboeing/osmnx โญ 4,988
    Python package to easily download, model, analyze, and visualize street networks and other geospatial features from OpenStreetMap.
    ๐Ÿ”— osmnx.readthedocs.io

  5. geopandas/geopandas โญ 4,623
    Python tools for geographic data
    ๐Ÿ”— geopandas.org

  6. shapely/shapely โญ 3,998
    Manipulation and analysis of geometric objects
    ๐Ÿ”— shapely.readthedocs.io/en/stable

  7. giswqs/geemap โญ 3,553
    A Python package for interactive geospatial analysis and visualization with Google Earth Engine.
    ๐Ÿ”— geemap.org

  8. holoviz/datashader โญ 3,371
    Quickly and accurately render even the largest data.
    ๐Ÿ”— datashader.org

  9. opengeos/leafmap โญ 3,263
    A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
    ๐Ÿ”— leafmap.org

  10. microsoft/torchgeo โญ 3,189
    TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
    ๐Ÿ”— www.osgeo.org/projects/torchgeo

  11. opengeos/segment-geospatial โญ 3,148
    A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
    ๐Ÿ”— samgeo.gishub.org

  12. google/earthengine-api โญ 2,752
    Python and JavaScript bindings for calling the Earth Engine API.

  13. rasterio/rasterio โญ 2,304
    Rasterio reads and writes geospatial raster datasets
    ๐Ÿ”— rasterio.readthedocs.io

  14. mcordts/cityscapesScripts โญ 2,203
    README and scripts for the Cityscapes Dataset

  15. azavea/raster-vision โญ 2,111
    An open source library and framework for deep learning on satellite and aerial imagery.
    ๐Ÿ”— docs.rastervision.io

  16. apache/sedona โญ 1,991
    A cluster computing framework for processing large-scale geospatial data
    ๐Ÿ”— sedona.apache.org

  17. plant99/felicette โญ 1,820
    Satellite imagery for dummies.

  18. gboeing/osmnx-examples โญ 1,616
    Gallery of OSMnx tutorials, usage examples, and feature demonstrations.
    ๐Ÿ”— osmnx.readthedocs.io

  19. jupyter-widgets/ipyleaflet โญ 1,504
    A Jupyter - Leaflet.js bridge
    ๐Ÿ”— ipyleaflet.readthedocs.io

  20. microsoft/GlobalMLBuildingFootprints โญ 1,467
    Worldwide building footprints derived from satellite imagery

  21. pysal/pysal โญ 1,358
    PySAL: Python Spatial Analysis Library Meta-Package
    ๐Ÿ”— pysal.org/pysal

  22. anitagraser/movingpandas โญ 1,266
    Movement trajectory classes and functions built on top of GeoPandas
    ๐Ÿ”— movingpandas.org

  23. residentmario/geoplot โญ 1,164
    High-level geospatial data visualization library for Python.
    ๐Ÿ”— residentmario.github.io/geoplot/index.html

  24. sentinel-hub/eo-learn โญ 1,145
    Earth observation processing framework for machine learning in Python
    ๐Ÿ”— eo-learn.readthedocs.io/en/latest

  25. opengeos/streamlit-geospatial โญ 904
    A multi-page streamlit app for geospatial
    ๐Ÿ”— huggingface.co/spaces/giswqs/streamlit

  26. osgeo/grass โญ 876
    GRASS GIS - free and open-source geospatial processing engine
    ๐Ÿ”— grass.osgeo.org

  27. makepath/xarray-spatial โญ 860
    Raster-based Spatial Analytics for Python
    ๐Ÿ”— xarray-spatial.readthedocs.io

  28. developmentseed/titiler โญ 823
    Build your own Raster dynamic map tile services
    ๐Ÿ”— developmentseed.org/titiler

Graph

Graphs and network libraries: network analysis, graph machine learning, visualisation.

  1. networkx/networkx โญ 15,349
    Network Analysis in Python
    ๐Ÿ”— networkx.org

  2. stellargraph/stellargraph โญ 2,967
    StellarGraph - Machine Learning on Graphs
    ๐Ÿ”— stellargraph.readthedocs.io

  3. westhealth/pyvis โญ 1,051
    Python package for creating and visualizing interactive network graphs.
    ๐Ÿ”— pyvis.readthedocs.io/en/latest

  4. microsoft/graspologic โญ 853
    graspologic is a package for graph statistical algorithms
    ๐Ÿ”— graspologic-org.github.io/graspologic

  5. rampasek/GraphGPS โญ 699
    Recipe for a General, Powerful, Scalable Graph Transformer

  6. dylanhogg/llmgraph โญ 370
    Create knowledge graphs with LLMs

GUI

Graphical user interface libraries and toolkits.

  1. hoffstadt/DearPyGui โญ 13,672
    Dear PyGui: A fast and powerful Graphical User Interface Toolkit for Python with minimal dependencies
    ๐Ÿ”— dearpygui.readthedocs.io/en/latest

  2. pysimplegui/PySimpleGUI โญ 13,542
    Python GUIs for Humans! PySimpleGUI is the top-rated Python application development environment. Launched in 2018 and actively developed, maintained, and supported in 2024. Transforms tkinter, Qt, WxPython, and Remi into a simple, intuitive, and fun experience for both hobbyists and expert users.
    ๐Ÿ”— www.pysimplegui.com

  3. parthjadhav/Tkinter-Designer โญ 9,511
    An easy and fast way to create a Python GUI 🐍

  4. samuelcolvin/FastUI โญ 8,650
    FastUI is a new way to build web application user interfaces defined by declarative Python code.
    ๐Ÿ”— fastui-demo.onrender.com

  5. r0x0r/pywebview โญ 4,944
    Build GUI for your Python program with JavaScript, HTML, and CSS
    ๐Ÿ”— pywebview.flowrl.com

  6. beeware/toga โญ 4,486
    A Python native, OS native GUI toolkit.
    ๐Ÿ”— toga.readthedocs.io/en/latest

  7. dddomodossola/remi โญ 3,549
    Python REMote Interface library. Platform independent. In about 100 Kbytes, perfect for your diet.

  8. wxwidgets/Phoenix โญ 2,381
    wxPython's Project Phoenix. A new implementation of wxPython, better, stronger, faster than he was before.
    ๐Ÿ”— wxpython.org

Jupyter

Jupyter, JupyterLab and Notebook tools, libraries and plugins.

  1. jupyterlab/jupyterlab โญ 14,356
    JupyterLab computational environment.
    ๐Ÿ”— jupyterlab.readthedocs.io

  2. jupyter/notebook โญ 11,991
    Jupyter Interactive Notebook
    ๐Ÿ”— jupyter-notebook.readthedocs.io

  3. marimo-team/marimo โญ 10,162
    A reactive Python notebook: run a cell or interact with a UI element, and marimo automatically runs dependent cells, keeping code and outputs consistent. marimo notebooks are stored as pure Python, executable as scripts, and deployable as apps.
    ๐Ÿ”— marimo.io

  4. mwouts/jupytext โญ 6,728
    Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
    ๐Ÿ”— jupytext.readthedocs.io

  5. nteract/papermill โญ 6,069
    📚 Parameterize, execute, and analyze notebooks
    ๐Ÿ”— papermill.readthedocs.io/en/latest

  6. connorferster/handcalcs โญ 5,699
    Python library for converting Python calculations into rendered latex.

  7. voila-dashboards/voila โญ 5,558
    Voilà turns Jupyter notebooks into standalone web applications
    ๐Ÿ”— voila.readthedocs.io

  8. jupyterlite/jupyterlite โญ 3,992
    Wasm powered Jupyter running in the browser 💡
    ๐Ÿ”— jupyterlite.rtfd.io/en/stable/try/lab

  9. executablebooks/jupyter-book โญ 3,956
    Create beautiful, publication-quality books and documents from computational content.
    ๐Ÿ”— jupyterbook.org

  10. jupyterlab/jupyterlab-desktop โญ 3,855
    JupyterLab desktop application, based on Electron.

  11. jupyterlab/jupyter-ai โญ 3,382
    A generative AI extension for JupyterLab
    ๐Ÿ”— jupyter-ai.readthedocs.io

  12. jupyter-widgets/ipywidgets โญ 3,190
    Interactive Widgets for the Jupyter Notebook
    ๐Ÿ”— ipywidgets.readthedocs.io

  13. quantopian/qgrid โญ 3,061
    An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks

  14. jupyter/nbdime โญ 2,696
    Tools for diffing and merging of Jupyter notebooks.
    ๐Ÿ”— nbdime.readthedocs.io

  15. mito-ds/mito โญ 2,342
    Jupyter extensions that help you write code faster: Context aware AI Chat, Autocomplete, and Spreadsheet
    ๐Ÿ”— trymito.io

  16. jupyter/nbviewer โญ 2,224
    nbconvert as a web service: Render Jupyter Notebooks as static web pages
    ๐Ÿ”— nbviewer.jupyter.org

  17. maartenbreddels/ipyvolume โญ 1,953
    3d plotting for Python in the Jupyter notebook based on IPython widgets using WebGL

  18. jupyter-lsp/jupyterlab-lsp โญ 1,842
    Coding assistance for JupyterLab (code navigation + hover suggestions + linters + autocompletion + rename) using Language Server Protocol
    ๐Ÿ”— jupyterlab-lsp.readthedocs.io

  19. jupyter/nbconvert โญ 1,783
    Jupyter Notebook Conversion
    ๐Ÿ”— nbconvert.readthedocs.io

  20. koaning/drawdata โญ 1,222
    Draw datasets from within Jupyter.

  21. 8080labs/pyforest โญ 1,109
    With pyforest you can use all your favorite Python libraries without importing them first. If you use a package that is not imported yet, pyforest imports the package for you and adds the code to the first Jupyter cell.
    ๐Ÿ”— 8080labs.com

  22. nbqa-dev/nbQA โญ 1,076
    Run ruff, isort, pyupgrade, mypy, pylint, flake8, and more on Jupyter Notebooks
    ๐Ÿ”— nbqa.readthedocs.io/en/latest/index.html

  23. vizzuhq/ipyvizzu โญ 959
    Build animated charts in Jupyter Notebook and similar environments with a simple Python syntax.
    ๐Ÿ”— ipyvizzu.vizzuhq.com

  24. aws/graph-notebook โญ 750
    Library extending Jupyter notebooks to integrate with Apache TinkerPop, openCypher, and RDF SPARQL.
    ๐Ÿ”— github.com/aws/graph-notebook

  25. linealabs/lineapy โญ 665
    Move fast from data science prototype to pipeline. Capture, analyze, and transform messy notebooks into data pipelines with just two lines of code.
    ๐Ÿ”— lineapy.org

  26. xiaohk/stickyland โญ 540
    Break the linear presentation of Jupyter Notebooks with sticky cells!
    ๐Ÿ”— xiaohk.github.io/stickyland

  27. infuseai/colab-xterm โญ 423
    Open a terminal in colab, including the free tier.

LLMs and ChatGPT

Large language model and GPT libraries and frameworks: auto-gpt, agents, QnA, chain-of-thought workflows, API integrations. Also see the Natural Language Processing category for crossover.

  1. significant-gravitas/AutoGPT โญ 171,082
    AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
    ๐Ÿ”— agpt.co

  2. ggerganov/llama.cpp โญ 72,991
    LLM inference in C/C++

  3. deepseek-ai/DeepSeek-V3 โญ 72,238
    A strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.

  4. nomic-ai/gpt4all โญ 72,183
    GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
    ๐Ÿ”— nomic.ai/gpt4all

  5. open-webui/open-webui โญ 67,137
    Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with a built-in inference engine for RAG.
    ๐Ÿ”— openwebui.com

  6. xtekky/gpt4free โญ 63,309
    The official gpt4free repository | a collection of various powerful language models | gpt-4o and deepseek v3 & r1
    ๐Ÿ”— t.me/g4f_channel

  7. killianlucas/open-interpreter โญ 58,149
    A natural language interface for computers
    ๐Ÿ”— openinterpreter.com

  8. facebookresearch/llama โญ 57,463
    Inference code for Llama models

  9. imartinez/private-gpt โญ 55,091
    Interact with your documents using the power of GPT, 100% privately, no data leaks
    ๐Ÿ”— privategpt.dev

  10. gpt-engineer-org/gpt-engineer โญ 53,005
    Platform to experiment with the AI Software Engineer. Terminal based. NOTE: Very different from https://gptengineer.app

  11. xai-org/grok-1 โญ 49,890
    This repository contains JAX example code for loading and running the Grok-1 open-weights model.

  12. geekan/MetaGPT โญ 45,927
    🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
    ๐Ÿ”— deepwisdom.ai

  13. oobabooga/text-generation-webui โญ 42,187
    A Gradio web UI for Large Language Models with support for multiple inference backends.

  14. thudm/ChatGLM-6B โญ 41,017
    ChatGLM-6B: An Open Bilingual Dialogue Language Model | 开源双语对话语言模型

  15. hiyouga/LLaMA-Factory โญ 39,299
    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
    ๐Ÿ”— huggingface.co/papers/2403.13372

  16. hpcaitech/ColossalAI โญ 39,042
    Making large AI models cheaper, faster and more accessible
    ๐Ÿ”— www.colossalai.org

  17. karpathy/nanoGPT โญ 39,008
    The simplest, fastest repository for training/finetuning medium-sized GPTs.

  18. lm-sys/FastChat โญ 37,647
    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

  19. quivrhq/quivr โญ 37,201
    Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Any way you want.
    ๐Ÿ”— core.quivr.com

  20. laion-ai/Open-Assistant โญ 37,193
    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
    ๐Ÿ”— open-assistant.io

  21. vllm-project/vllm โญ 36,204
    A high-throughput and memory-efficient inference and serving engine for LLMs
    ๐Ÿ”— docs.vllm.ai

  22. moymix/TaskMatrix โญ 34,542
    Connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting.

  23. pythagora-io/gpt-pilot โญ 32,286
    The first real AI developer

  24. infiniflow/ragflow โญ 31,032
    RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
    ๐Ÿ”— ragflow.io

  25. tatsu-lab/stanford_alpaca โญ 29,777
    Code and documentation to train Stanford's Alpaca models, and generate the data.
    ๐Ÿ”— crfm.stanford.edu/2023/03/13/alpaca.html

  26. unclecode/crawl4ai โญ 28,696
    AI-ready web crawling tailored for LLMs, AI agents, and data pipelines. Open source, flexible, and built for real-time performance, Crawl4AI empowers developers with unmatched speed, precision, and deployment ease.
    ๐Ÿ”— crawl4ai.com

  27. danielmiessler/fabric โญ 28,305
    fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
    ๐Ÿ”— danielmiessler.com/p/fabric-origin-story

  28. meta-llama/llama3 โญ 28,175
    The official Meta Llama 3 GitHub site

  29. khoj-ai/khoj โญ 25,860
    Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI
    ๐Ÿ”— khoj.dev

  30. vision-cair/MiniGPT-4 โญ 25,543
    Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
    ๐Ÿ”— minigpt-4.github.io

  31. karpathy/llm.c โญ 25,221
    LLM training in simple, pure C/CUDA. There is no need for 245MB of PyTorch or 107MB of CPython

  32. embedchain/mem0 โญ 24,373
    The Memory layer for AI Agents
    ๐Ÿ”— mem0.ai

  33. microsoft/JARVIS โญ 23,904
    JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

  34. unslothai/unsloth โญ 23,135
    Finetune Llama 3.3, DeepSeek-R1, Mistral, Phi-4 & Gemma 2 LLMs 2-5x faster with 70% less memory
    ๐Ÿ”— unsloth.ai

  35. microsoft/semantic-kernel โญ 22,913
    Integrate cutting-edge LLM technology quickly and easily into your apps
    ๐Ÿ”— aka.ms/semantic-kernel

  36. openai/gpt-2 โญ 22,899
    Code for the paper "Language Models are Unsupervised Multitask Learners"
    ๐Ÿ”— openai.com/blog/better-language-models

  37. microsoft/graphrag โญ 22,084
    A modular graph-based Retrieval-Augmented Generation (RAG) system
    ๐Ÿ”— microsoft.github.io/graphrag

  38. stanfordnlp/dspy โญ 21,643
    DSPy: The framework for programming, not prompting, language models
    ๐Ÿ”— dspy.ai

  39. haotian-liu/LLaVA โญ 21,285
    [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
    ๐Ÿ”— llava.hliu.cc

  40. karpathy/minGPT โญ 21,242
    A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

  41. openai/chatgpt-retrieval-plugin โญ 21,105
    The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.

  42. cinnamon/kotaemon โญ 20,768
    An open-source RAG UI for chatting with your documents. Built with both end users and developers in mind
    ๐Ÿ”— cinnamon.github.io/kotaemon

  43. mlc-ai/mlc-llm โญ 19,850
    Universal LLM Deployment Engine with ML Compilation
    ๐Ÿ”— llm.mlc.ai

  44. guidance-ai/guidance โญ 19,576
    A guidance language for controlling large language models.

  45. rasahq/rasa โญ 19,335
    💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
    ๐Ÿ”— rasa.com/docs/rasa

  46. deepset-ai/haystack โญ 18,969
    AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversatio...
    ๐Ÿ”— haystack.deepset.ai

  47. stitionai/devika โญ 18,851
    Devika is an advanced AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective.

  48. tloen/alpaca-lora โญ 18,792
    Instruct-tune LLaMA on consumer hardware

  49. karpathy/llama2.c โญ 17,970
    Inference Llama 2 in one file of pure C

  50. huggingface/peft โญ 17,146
    🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
    ๐Ÿ”— huggingface.co/docs/peft

  51. berriai/litellm โญ 16,967
    Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
    ๐Ÿ”— docs.litellm.ai/docs

  52. qwenlm/Qwen โญ 16,475
    The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

  53. facebookresearch/codellama โญ 16,175
    Inference code for CodeLlama models

  54. facebookresearch/llama-cookbook โญ 16,111
    Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family and how to use the models on various provider services.
    ๐Ÿ”— www.llama.com

  55. transformeroptimus/SuperAGI โญ 15,815
    <⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
    ๐Ÿ”— superagi.com

  56. thudm/ChatGLM2-6B โญ 15,759
    ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

  57. idea-research/Grounded-Segment-Anything โญ 15,658
    Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
    ๐Ÿ”— arxiv.org/abs/2401.14159

  58. huggingface/open-r1 โญ 15,441
    The goal of this repo is to build the missing pieces of the R1 pipeline such that everybody can reproduce and build on top of it

  59. openai/evals โญ 15,436
    Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

  60. dao-ailab/flash-attention โญ 15,290
    Fast and memory-efficient exact attention

  61. mayooear/gpt4-pdf-chatbot-langchain โญ 15,046
    GPT4 & LangChain Chatbot for large PDF docs
    ๐Ÿ”— www.youtube.com/watch?v=ih9pbgvvoo4

  62. fauxpilot/fauxpilot โญ 14,660
    FauxPilot - an open-source alternative to the GitHub Copilot server

  63. mlc-ai/web-llm โญ 14,544
    High-performance In-browser LLM Inference Engine
    ๐Ÿ”— webllm.mlc.ai

  64. blinkdl/RWKV-LM โญ 13,081
    RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and f...

  65. vanna-ai/vanna โญ 12,942
    🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
    ๐Ÿ”— vanna.ai/docs

  66. microsoft/BitNet โญ 12,691
    Official inference framework for 1-bit LLMs (e.g., BitNet b1.58). It offers a suite of optimized kernels that support fast and lossless inference of 1.58-bit models.

  67. pathwaycom/llm-app โญ 12,582
    Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳 Docker-friendly. ⚡ Always in sync with SharePoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
    ๐Ÿ”— pathway.com/developers/templates

  68. paddlepaddle/PaddleNLP โญ 12,312
    👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
    ๐Ÿ”— paddlenlp.readthedocs.io

  69. openlmlab/MOSS โญ 12,024
    An open-source tool-augmented conversational language model from Fudan University
    ๐Ÿ”— txsun1997.github.io/blogs/moss.html

  70. skyvern-ai/skyvern โญ 11,908
    Skyvern automates browser-based workflows using LLMs and computer vision. It provides a simple API endpoint to fully automate manual workflows, replacing brittle or unreliable automation solutions.
    ๐Ÿ”— www.skyvern.com

  71. shishirpatil/gorilla โญ 11,743
    Enables LLMs to use tools by invoking APIs. Given a query, Gorilla comes up with the semantically and syntactically correct API.
    ๐Ÿ”— gorilla.cs.berkeley.edu

  72. nirdiamant/RAG_Techniques โญ 11,713
    One of the most comprehensive and dynamic collections of Retrieval-Augmented Generation (RAG) tutorials available today. This repository serves as a hub for cutting-edge techniques aimed at enhancing the accuracy, efficiency, and contextual richness of RAG systems.

  73. h2oai/h2ogpt โญ 11,626
    Private chat with local GPT with documents, images, video, etc. 100% private, Apache 2.0. Supports Ollama, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
    ๐Ÿ”— h2o.ai

  74. lightning-ai/litgpt โญ 11,435
    20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
    ๐Ÿ”— lightning.ai

  76. nvidia/Megatron-LM โญ 11,243
    Ongoing research training transformer models at scale
    ๐Ÿ”— docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/index.html#quick-start

  77. microsoft/LoRA โญ 11,216
    Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
    ๐Ÿ”— arxiv.org/abs/2106.09685

  78. lvwerra/trl โญ 11,053
    Train transformer language models with reinforcement learning.
    ๐Ÿ”— hf.co/docs/trl

  79. google-research/vision_transformer โญ 10,845
    Vision Transformer and MLP-Mixer Architectures

  80. databrickslabs/dolly โญ 10,805
    Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
    ๐Ÿ”— www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html

  81. outlines-dev/outlines โญ 10,572
    Structured Text Generation from LLMs
    ๐Ÿ”— dottxt-ai.github.io/outlines

  82. artidoro/qlora โญ 10,210
    QLoRA: Efficient Finetuning of Quantized LLMs
    ๐Ÿ”— arxiv.org/abs/2305.14314

  83. anthropics/anthropic-cookbook โญ 10,161
    Provides code and guides designed to help developers build with Claude, offering copy-able code snippets that you can easily integrate into your own projects.

  84. andrewyng/aisuite โญ 9,911
    Simple, unified interface to multiple Generative AI providers. aisuite makes it easy for developers to use multiple LLMs through a standardized interface.

  85. mistralai/mistral-inference โญ 9,908
    Official inference library for Mistral models
    ๐Ÿ”— mistral.ai

  86. microsoft/promptflow โญ 9,870
    Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
    ๐Ÿ”— microsoft.github.io/promptflow

  87. prompt-toolkit/python-prompt-toolkit โญ 9,511
    Library for building powerful interactive command line applications in Python
    ๐Ÿ”— python-prompt-toolkit.readthedocs.io

  88. mshumer/gpt-prompt-engineer โญ 9,463
    Simply input a description of your task and some test cases, and the system will generate, test, and rank a multitude of prompts to find the ones that perform the best.

  89. blinkdl/ChatRWKV โญ 9,448
    ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

  90. karpathy/minbpe โญ 9,362
    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

  91. swivid/F5-TTS โญ 9,339
    Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
    ๐Ÿ”— arxiv.org/abs/2410.06885

  92. jxnl/instructor โญ 9,231
    Instructor is a Python library that makes it a breeze to work with structured outputs from large language models (LLMs). Built on top of Pydantic, it provides a simple, transparent, and user-friendly API to manage validation, retries, and streaming responses.
    ๐Ÿ”— python.useinstructor.com

  93. llmware-ai/llmware โญ 8,578
    Unified framework for building enterprise RAG pipelines with small, specialized models
    ๐Ÿ”— llmware-ai.github.io/llmware

  94. apple/ml-ferret โญ 8,556
    Ferret: Refer and Ground Anything Anywhere at Any Granularity

  95. abetlen/llama-cpp-python โญ 8,536
    Simple Python bindings for @ggerganov's llama.cpp library.
    ๐Ÿ”— llama-cpp-python.readthedocs.io

  96. axolotl-ai-cloud/axolotl โญ 8,472
    Go ahead and axolotl questions
    ๐Ÿ”— axolotl-ai-cloud.github.io/axolotl

  97. sgl-project/sglang โญ 8,442
    SGLang is a fast serving framework for large language models and vision language models.
    ๐Ÿ”— docs.sglang.ai

  98. chainlit/chainlit โญ 8,434
    Build Conversational AI in minutes ⚡️
    ๐Ÿ”— docs.chainlit.io

  99. thudm/CodeGeeX โญ 8,353
    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
    ๐Ÿ”— codegeex.cn

  100. optimalscale/LMFlow โญ 8,339
    An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
    ๐Ÿ”— optimalscale.github.io/lmflow

  101. eleutherai/gpt-neo โญ 8,266
    An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
    ๐Ÿ”— www.eleuther.ai

  102. jzhang38/TinyLlama โญ 8,162
    The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

  103. sjtu-ipads/PowerInfer โญ 8,072
    High-speed Large Language Model Serving for Local Deployment

  104. explodinggradients/ragas โญ 8,053
    Supercharge Your LLM Application Evaluations 🚀
    ๐Ÿ”— docs.ragas.io

  105. lianjiatech/BELLE โญ 8,029
    BELLE: Be Everyone's Large Language model Engine (开源中文对话大模型)

  106. vaibhavs10/insanely-fast-whisper โญ 8,028
    An opinionated CLI to transcribe Audio files w/ Whisper on-device! Powered by 🤗 Transformers, Optimum & flash-attn

  107. 01-ai/Yi โญ 7,802
    The Yi series models are the next generation of open-source large language models trained from scratch by 01.AI.
    ๐Ÿ”— 01.ai

  108. plachtaa/VALL-E-X โญ 7,767
    An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

  109. jiayi-pan/TinyZero โญ 7,718
    TinyZero is a reproduction of DeepSeek R1 Zero in countdown and multiplication tasks.

  110. thudm/GLM-130B โญ 7,679
    GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

  111. eleutherai/lm-evaluation-harness โญ 7,637
    A framework for few-shot evaluation of language models.
    ๐Ÿ”— www.eleuther.ai

  112. anthropics/anthropic-quickstarts โญ 7,578
    A collection of projects designed to help developers quickly get started with building applications using the Anthropic API. Each quickstart provides a foundation that you can easily build upon and customize for your specific needs.

  113. sweepai/sweep โญ 7,496
    Sweep: open-source AI-powered Software Developer for small features and bug fixes.
    ๐Ÿ”— sweep.dev

  114. openlm-research/open_llama โญ 7,434
    OpenLLaMA: An Open Reproduction of LLaMA

  115. e2b-dev/E2B โญ 7,413
    E2B is an open-source infrastructure that allows you to run AI-generated code in secure isolated sandboxes in the cloud
    ๐Ÿ”— e2b.dev/docs

  116. zilliztech/GPTCache โญ 7,362
    Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
    ๐Ÿ”— gptcache.readthedocs.io

  117. bigcode-project/starcoder โญ 7,355
    Home of StarCoder: fine-tuning & inference!

  118. vikhyat/moondream โญ 7,225
    A tiny open-source computer-vision language model designed to run efficiently on edge devices
    ๐Ÿ”— moondream.ai

  119. skypilot-org/skypilot โญ 7,110
    SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
    ๐Ÿ”— docs.skypilot.co

  120. eleutherai/gpt-neox โญ 7,075
    An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
    ๐Ÿ”— www.eleuther.ai

  121. bhaskatripathi/pdfGPT โญ 7,041
    PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!
    ๐Ÿ”— huggingface.co/spaces/bhaskartripathi/pdfchatter

  122. apple/corenet โญ 7,000
    CoreNet is a deep neural network toolkit that allows researchers and engineers to train standard and novel small and large-scale models for a variety of tasks, including foundation models (e.g., CLIP and LLM), object classification, object detection, and semantic segmentation.

  123. future-house/paper-qa โญ 6,870
    High-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature

  124. mit-han-lab/streaming-llm โญ 6,780
    [ICLR 2024] Efficient Streaming Language Models with Attention Sinks
    ๐Ÿ”— arxiv.org/abs/2309.17453

  125. weaviate/Verba โญ 6,746
    Retrieval Augmented Generation (RAG) chatbot powered by Weaviate

  126. internlm/InternLM โญ 6,734
    Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
    ๐Ÿ”— internlm.intern-ai.org.cn

  127. langchain-ai/opengpts โญ 6,546
    An open source effort to create a similar experience to OpenAI's GPTs and Assistants API.

  128. run-llama/rags โญ 6,376
    RAGs is a Streamlit app that lets you create a RAG pipeline from a data source using natural language.

  129. nat/openplayground โญ 6,318
    An LLM playground you can run on your laptop

  130. lightning-ai/lit-llama โญ 6,027
    Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

  131. simonw/llm โญ 5,909
    Access large language models from the command-line
    ๐Ÿ”— llm.datasette.io

  132. minedojo/Voyager โญ 5,865
    An Open-Ended Embodied Agent with Large Language Models
    ๐Ÿ”— voyager.minedojo.org

  133. pytorch-labs/gpt-fast โญ 5,774
    Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

  134. langchain-ai/chat-langchain โญ 5,663
    Locally hosted chatbot specifically focused on question answering over the LangChain documentation
    ๐Ÿ”— chat.langchain.com

  135. lyogavin/airllm โญ 5,640
    AirLLM optimizes inference memory usage, allowing 70B large language models to run inference on a single 4GB GPU card without quantization, distillation or pruning. You can also run 405B Llama3.1 on 8GB of VRAM.

  136. canner/WrenAI โญ 5,612
    Open-source GenBI AI Agent that empowers data-driven teams to chat with their data to generate Text-to-SQL, charts, spreadsheets, reports, and BI.
    ๐Ÿ”— getwren.ai/oss

  137. microsoft/promptbase โญ 5,520
    promptbase is an evolving collection of resources, best practices, and example scripts for eliciting the best performance from foundation models.

  138. promptfoo/promptfoo โญ 5,410
    Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
    ๐Ÿ”— promptfoo.dev

  139. qwenlm/Qwen-VL โญ 5,379
    The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

  140. dsdanielpark/Bard-API โญ 5,298
    An unofficial Python package that returns responses from Google Bard using a cookie value.
    ๐Ÿ”— pypi.org/project/bardapi

  141. modelscope/ms-swift โญ 5,209
    Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2.5, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL2, Phi3.5-Vision, GOT-OCR2, ...).
    ๐Ÿ”— swift.readthedocs.io/zh-cn/latest

  142. cg123/mergekit โญ 5,197
    Tools for merging pretrained large language models.

  143. arcee-ai/mergekit โญ 5,197
    Tools for merging pretrained large language models.

  144. allenai/OLMo โญ 5,115
    OLMo is a repository for training and using AI2's state-of-the-art open language models. It is designed by scientists, for scientists.
    ๐Ÿ”— allenai.org/olmo

  145. openbmb/ToolBench โญ 4,867
    [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
    ๐Ÿ”— openbmb.github.io/toolbench

  146. microsoft/LLMLingua โญ 4,843
    [EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
    ๐Ÿ”— llmlingua.com

  147. togethercomputer/RedPajama-Data โญ 4,633
    The RedPajama-Data repository contains code for preparing large datasets for training large language models.

  148. open-compass/opencompass โญ 4,579
    OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
    ๐Ÿ”— opencompass.org.cn

  149. 1rgs/jsonformer โญ 4,562
    A Bulletproof Way to Generate Structured JSON from Language Models

  150. pipecat-ai/pipecat โญ 4,526
    Open Source framework for voice and multimodal conversational AI

  151. guardrails-ai/guardrails โญ 4,439
    Open-source Python package for specifying structure and type, validating and correcting the outputs of large language models (LLMs)
    ๐Ÿ”— www.guardrailsai.com/docs

  152. kyegomez/tree-of-thoughts โญ 4,435
    Plug in and Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning by at least 70%
    ๐Ÿ”— discord.gg/qutxnk2nmf

  153. nvidia/NeMo-Guardrails โญ 4,383
    NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

  154. microsoft/BioGPT โญ 4,354
    Implementation of BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining

  155. linkedin/Liger-Kernel โญ 4,288
    Efficient Triton Kernels for LLM Training
    ๐Ÿ”— arxiv.org/pdf/2410.10989

  156. instruction-tuning-with-gpt-4/GPT-4-LLM โญ 4,259
    Instruction Tuning with GPT-4
    ๐Ÿ”— instruction-tuning-with-gpt-4.github.io

  157. yizhongw/self-instruct โญ 4,254
    Aligning pretrained language models with instruction data generated by themselves.

  158. katanaml/sparrow โญ 4,198
    Sparrow is a solution for efficient data extraction and processing from various documents and images like invoices and receipts
    ๐Ÿ”— sparrow.katanaml.io

  159. h2oai/h2o-llmstudio โญ 4,150
    H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
    ๐Ÿ”— h2o.ai

  160. mshumer/gpt-llm-trainer โญ 4,071
    Input a description of your task, and the system will generate a dataset, parse it, and fine-tune a LLaMA 2 model for you

  161. ragapp/ragapp โญ 4,011
    The easiest way to use Agentic RAG in any enterprise

  162. turboderp/exllamav2 โญ 3,915
    A fast inference library for running LLMs locally on modern consumer-class GPUs

  163. microsoft/LMOps โญ 3,824
    General technology for enabling AI capabilities w/ LLMs and MLLMs
    ๐Ÿ”— aka.ms/generalai

  164. ravenscroftj/turbopilot โญ 3,818
    Turbopilot is an open source large-language-model based code completion engine that runs locally on CPU

  165. eth-sri/lmql โญ 3,793
    A language for constraint-guided and efficient LLM programming.
    ๐Ÿ”— lmql.ai

  166. agiresearch/AIOS โญ 3,773
    AIOS, a Large Language Model (LLM) Agent operating system, embeds a large language model into the Operating System (OS) as the brain of the OS, enabling an operating system "with soul" -- an important step towards AGI.
    ๐Ÿ”— aios.foundation

  167. mmabrouk/llm-workflow-engine โญ 3,679
    Power CLI and Workflow manager for LLMs (core package)

  168. truefoundry/cognita โญ 3,558
    RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
    ๐Ÿ”— cognita.truefoundry.com

  169. defog-ai/sqlcoder โญ 3,528
    SoTA LLM for converting natural language questions to SQL queries

  170. lm-sys/RouteLLM โญ 3,519
    A framework for serving and evaluating LLM routers - save LLM costs without compromising quality

  171. marker-inc-korea/AutoRAG โญ 3,515
    AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
    ๐Ÿ”— auto-rag.com

  172. minimaxir/simpleaichat โญ 3,499
    Python package for easily interfacing with chat apps, with robust features and minimal code complexity.

  173. iryna-kondr/scikit-llm โญ 3,411
    Seamlessly integrate LLMs into scikit-learn.
    ๐Ÿ”— beastbyte.ai

  174. next-gpt/NExT-GPT โญ 3,405
    Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
    ๐Ÿ”— next-gpt.github.io

  175. minimaxir/gpt-2-simple โญ 3,400
    Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts

  176. jaymody/picoGPT โญ 3,303
    An unnecessarily tiny implementation of GPT-2 in NumPy.

  177. deep-diver/LLM-As-Chatbot โญ 3,299
    LLM as a Chatbot Service

  178. luodian/Otter โญ 3,228
    🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
    ๐Ÿ”— otter-ntu.github.io

  179. bclavie/RAGatouille โญ 3,214
    Bridging the gap between state-of-the-art research and alchemical RAG pipeline practices.

  180. huggingface/text-embeddings-inference โญ 3,105
    A blazing fast inference solution for text embeddings models
    ๐Ÿ”— huggingface.co/docs/text-embeddings-inference/quick_tour

  181. microsoft/torchscale โญ 3,044
    Foundation Architecture for (M)LLMs
    ๐Ÿ”— aka.ms/generalai

  182. baichuan-inc/Baichuan-13B โญ 2,973
    A 13B large language model developed by Baichuan Intelligent Technology
    ๐Ÿ”— huggingface.co/baichuan-inc/baichuan-13b-chat

  183. li-plus/chatglm.cpp โญ 2,964
    C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

  184. cohere-ai/cohere-toolkit โญ 2,934
    Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.

  185. freedomintelligence/LLMZoo โญ 2,928
    ⚡ LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models. ⚡

  186. verazuo/jailbreak_llms โญ 2,917
    Official repo for the ACM CCS 2024 paper "'Do Anything Now': Characterizing and Evaluating In-The-Wild Jailbreak Prompts"
    ๐Ÿ”— jailbreak-llms.xinyueshen.me

  187. meta-llama/PurpleLlama โญ 2,870
    Set of tools to assess and improve LLM security.

  188. mistralai/mistral-finetune โญ 2,825
    A light-weight codebase that enables memory-efficient and performant finetuning of Mistral's models. It is based on LoRA.

  189. lightning-ai/LitServe โญ 2,812
    Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale.
    ๐Ÿ”— lightning.ai/docs/litserve

  190. juncongmoo/pyllama โญ 2,806
    LLaMA: Open and Efficient Foundation Language Models

  191. hegelai/prompttools โญ 2,779
    Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
    ๐Ÿ”— prompttools.readthedocs.io

  192. alpha-vllm/LLaMA2-Accessory โญ 2,750
    An Open-source Toolkit for LLM Development
    ๐Ÿ”— llama2-accessory.readthedocs.io

  193. mit-han-lab/llm-awq โญ 2,703
    AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

  194. paperswithcode/galai โญ 2,702
    Model API for GALACTICA

  195. nirdiamant/Prompt_Engineering โญ 2,695
    A comprehensive collection of tutorials and implementations for Prompt Engineering techniques, ranging from fundamental concepts to advanced strategies.

  196. sylphai-inc/AdalFlow โญ 2,650
    Unified auto-differentiative framework for both zero-shot prompt optimization and few-shot optimization. It advances existing auto-optimization research, including TextGrad and DSPy.
    ๐Ÿ”— adalflow.sylph.ai

  197. cheshire-cat-ai/core โญ 2,607
    AI agent microservice
    ๐Ÿ”— cheshirecat.ai

  198. noahshinn/reflexion โญ 2,567
    [NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning

  199. databricks/dbrx โญ 2,531
    Code examples and resources for DBRX, a large language model developed by Databricks
    ๐Ÿ”— www.databricks.com

  200. pytorch/executorch โญ 2,461
    An end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices.
    ๐Ÿ”— pytorch.org/executorch

  201. ofa-sys/OFA โญ 2,457
    Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

  202. young-geng/EasyLM โญ 2,448
    Large language models (LLMs) made easy. EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.

  203. janhq/cortex.cpp โญ 2,429
    Cortex is a Local AI API Platform that is used to run and customize LLMs.
    ๐Ÿ”— cortex.so

  204. novasky-ai/SkyThought โญ 2,353
    Sky-T1: Train your own O1 preview model within $450
    ๐Ÿ”— novasky-ai.github.io

  205. civitai/sd_civitai_extension โญ 2,350
    All of the Civitai models inside Automatic 1111 Stable Diffusion Web UI

  206. predibase/lorax โญ 2,333
    Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
    ๐Ÿ”— loraexchange.ai

  207. intel/neural-compressor โญ 2,312
    SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
    ๐Ÿ”— intel.github.io/neural-compressor

  208. truera/trulens โญ 2,307
    Evaluation and Tracking for LLM Experiments
    ๐Ÿ”— www.trulens.org

  209. spcl/graph-of-thoughts โญ 2,266
    Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"
    ๐Ÿ”— arxiv.org/pdf/2308.09687.pdf

  210. openai/simple-evals โญ 2,260
    Lightweight library for evaluating language models

  211. argilla-io/distilabel โญ 2,227
    Distilabel is the framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
    ๐Ÿ”— distilabel.argilla.io

  212. openai/finetune-transformer-lm โญ 2,185
    Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
    ๐Ÿ”— s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf

  213. volcengine/verl โญ 2,116
    veRL is a flexible, efficient and production-ready RL training library for large language models (LLMs).
    ๐Ÿ”— verl.readthedocs.io/en/latest/index.html

  214. tairov/llama2.mojo โญ 2,108
    Inference Llama 2 in one file of pure 🔥
    ๐Ÿ”— www.modular.com/blog/community-spotlight-how-i-built-llama2-by-aydyn-tairov

  215. azure-samples/graphrag-accelerator โญ 2,101
    One-click deploy of a Knowledge Graph powered RAG (GraphRAG) in Azure
    ๐Ÿ”— github.com/microsoft/graphrag

  216. evolvinglmms-lab/lmms-eval โญ 2,051
    Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
    ๐Ÿ”— lmms-lab.framer.ai

  217. openai/image-gpt โญ 2,049
    Archived. Code and models from the paper "Generative Pretraining from Pixels"

  218. agenta-ai/agenta โญ 2,036
    The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM Observability all in one place.
    ๐Ÿ”— www.agenta.ai

  219. ist-daslab/gptq โญ 2,016
    Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
    ๐Ÿ”— arxiv.org/abs/2210.17323

  220. lucidrains/toolformer-pytorch โญ 1,998
    Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI

  221. neulab/prompt2model โญ 1,981
    prompt2model - Generate Deployable Models from Natural Language Instructions

  222. microsoft/Megatron-DeepSpeed โญ 1,972
    Ongoing research training transformer language models at scale, including: BERT & GPT-2

  223. openai/gpt-2-output-dataset โญ 1,958
    Dataset of GPT-2 outputs for research in detection, biases, and more

  224. akariasai/self-rag โญ 1,952
    This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
    ๐Ÿ”— selfrag.github.io

  225. epfllm/meditron โญ 1,945
    Meditron is a suite of open-source medical Large Language Models (LLMs).
    ๐Ÿ”— huggingface.co/epfl-llm

  226. flashinfer-ai/flashinfer โญ 1,926
    FlashInfer is a library and kernel generator for Large Language Models that provides high-performance implementation of LLM GPU kernels such as FlashAttention, SparseAttention, PageAttention, Sampling
    ๐Ÿ”— flashinfer.ai

  227. casper-hansen/AutoAWQ โญ 1,924
    AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
    ๐Ÿ”— casper-hansen.github.io/autoawq

  228. facebookresearch/chameleon โญ 1,916
    Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
    ๐Ÿ”— arxiv.org/abs/2405.09818

  229. facebookresearch/large_concept_model โญ 1,860
    Large Concept Models: Language modeling in a sentence representation space

  230. minimaxir/aitextgen โญ 1,843
    A robust Python tool for text-based AI training and generation using GPT-2.
    ๐Ÿ”— docs.aitextgen.io

  231. openai/gpt-discord-bot โญ 1,801
    Example Discord bot written in Python that uses the completions API to have conversations with the text-davinci-003 model, and the moderations API to filter the messages.

  232. huggingface/smollm โญ 1,785
    Everything about the SmolLM2 and SmolVLM family of models
    ๐Ÿ”— huggingface.co/huggingfacetb

  233. ray-project/llm-applications โญ 1,753
    A comprehensive guide to building RAG-based LLM applications for production.

  234. ruc-nlpir/FlashRAG โญ 1,715
    FlashRAG is a Python toolkit for the reproduction and development of RAG research. Our toolkit includes 36 pre-processed benchmark RAG datasets and 15 state-of-the-art RAG algorithms.
    ๐Ÿ”— arxiv.org/abs/2405.13576

  235. noamgat/lm-format-enforcer โญ 1,687
    Enforce the output format (JSON Schema, Regex etc) of a language model

  236. modelcontextprotocol/python-sdk โญ 1,636
    The Model Context Protocol allows applications to provide context for LLMs in a standardized way, separating the concerns of providing context from the actual LLM interaction.
    ๐Ÿ”— modelcontextprotocol.io

  237. qwenlm/Qwen-Audio โญ 1,575
    The official repo of Qwen-Audio (通义千问-Audio), the chat and pretrained large audio language model proposed by Alibaba Cloud.

  238. jina-ai/thinkgpt โญ 1,565
    Agent techniques to augment your LLM and push it beyond its limits

  239. agentops-ai/tokencost โญ 1,546
    Easy token price estimates for 400+ LLMs. TokenOps.
    ๐Ÿ”— agentops.ai

  240. meetkai/functionary โญ 1,508
    Chat language model that can use tools and interpret the results

  241. deep-agent/R1-V โญ 1,481
    We are building a general framework for Reinforcement Learning with Verifiable Rewards (RLVR) in VLM. RLVR outperforms chain-of-thought supervised fine-tuning (CoT-SFT) in both effectiveness and out-of-distribution (OOD) robustness for vision language models.

  242. roboflow/maestro โญ 1,460
    Streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL
    ๐Ÿ”— maestro.roboflow.com

  243. cstankonrad/long_llama โญ 1,448
    LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.

  244. farizrahman4u/loopgpt โญ 1,443
    Re-implementation of Auto-GPT as a python package, written with modularity and extensibility in mind.

  245. run-llama/llama-lab โญ 1,440
    Llama Lab is a repo dedicated to building cutting-edge projects using LlamaIndex

  246. huggingface/nanotron โญ 1,414
    Minimalistic large language model 3D-parallelism training

  247. chatarena/chatarena โญ 1,406
    ChatArena (or Chat Arena) provides multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.

  248. explosion/spacy-transformers โญ 1,360
    🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
    ๐Ÿ”— spacy.io/usage/embeddings-transformers

  249. bigscience-workshop/Megatron-DeepSpeed โญ 1,358
    Ongoing research training transformer language models at scale, including: BERT & GPT-2

  250. karpathy/nano-llama31 โญ 1,303
    This repo is to Llama 3.1 what nanoGPT is to GPT-2, i.e. it is a minimal, dependency-free implementation of the Llama 3.1 architecture

  251. answerdotai/rerankers โญ 1,266
    Welcome to rerankers! Our goal is to provide users with a simple API to use any reranking models.

  252. ray-project/ray-llm โญ 1,252
    RayLLM - LLMs on Ray
    ๐Ÿ”— aviary.anyscale.com

  253. facebookresearch/MobileLLM โญ 1,233
    Training code of MobileLLM introduced in our work: "MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases"

  254. srush/MiniChain โญ 1,220
    A tiny library for coding with large language models.
    ๐Ÿ”— srush-minichain.hf.space

  255. mlfoundations/dclm โญ 1,213
    DataComp for Language Models

  256. keirp/automatic_prompt_engineer โญ 1,205
    Large Language Models Are Human-Level Prompt Engineers

  257. hao-ai-lab/LookaheadDecoding โญ 1,184
    Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
    ๐Ÿ”— arxiv.org/abs/2402.02057

  258. explosion/spacy-llm โญ 1,180
    🦙 Integrating LLMs into structured NLP pipelines
    ๐Ÿ”— spacy.io/usage/large-language-models

  259. ibm/Dromedary โญ 1,133
    Dromedary: towards helpful, ethical and reliable LLMs.

  260. topoteretes/cognee โญ 1,108
    Reliable LLM Memory for AI Applications and AI Agents
    ๐Ÿ”— www.cognee.ai

  261. lupantech/chameleon-llm โญ 1,108
    Code for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".
    ๐Ÿ”— chameleon-llm.github.io

  262. rlancemartin/auto-evaluator โญ 1,067
    Evaluation tool for LLM QA chains
    ๐Ÿ”— autoevaluator.langchain.com

  263. huggingface/lighteval โญ 1,061
    LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.

  264. ctlllll/LLM-ToolMaker โญ 1,025
    Large Language Models as Tool Makers

  265. microsoft/Llama-2-Onnx โญ 1,024
    A Microsoft optimized version of the Llama 2 model, available from Meta

  266. nomic-ai/pygpt4all โญ 1,020
    Officially supported Python bindings for llama.cpp + gpt4all
    ๐Ÿ”— nomic-ai.github.io/pygpt4all

  267. cerebras/modelzoo โญ 1,011
    Examples of common deep learning models that can be trained on Cerebras hardware

  268. nirdiamant/Controllable-RAG-Agent โญ 1,010
    An advanced Retrieval-Augmented Generation (RAG) solution designed to tackle complex questions that simple semantic similarity-based retrieval cannot solve

  269. minishlab/model2vec โญ 1,006
    Model2Vec is a technique to turn any sentence transformer into a really small static model, reducing model size by 15x and making the models up to 500x faster, with a small drop in performance
    ๐Ÿ”— minishlab.github.io

  270. pinecone-io/canopy โญ 998
    Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone
    ๐Ÿ”— www.pinecone.io

  271. ajndkr/lanarky โญ 986
    The web framework for building LLM microservices
    ๐Ÿ”— lanarky.ajndkr.com

  272. huggingface/evaluation-guidebook โญ 981
    Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

  273. likejazz/llama3.np โญ 972
    llama3.np is a pure NumPy implementation for Llama 3 model.

  274. datadreamer-dev/DataDreamer โญ 949
    DataDreamer is a powerful open-source Python library for prompting, synthetic data generation, and training workflows. It is designed to be simple, extremely efficient, and research-grade.
    ๐Ÿ”— datadreamer.dev

  275. huggingface/optimum-nvidia โญ 927
    Optimum-NVIDIA delivers the best inference performance on the NVIDIA platform through Hugging Face. Run LLaMA 2 at 1,200 tokens/second (up to 28x faster than the framework)

  276. soulter/hugging-chat-api โญ 898
    HuggingChat Python API 🤗

  277. muennighoff/sgpt โญ 861
    SGPT: GPT Sentence Embeddings for Semantic Search
    ๐Ÿ”— arxiv.org/abs/2202.08904

  278. prometheus-eval/prometheus-eval โญ 855
    Evaluate your LLM's response with Prometheus and GPT4 💯

  279. langchain-ai/langsmith-cookbook โญ 833
    LangSmith is a platform for building production-grade LLM applications.
    ๐Ÿ”— langsmith-cookbook.vercel.app

  280. wandb/weave โญ 797
    Weave is a toolkit for developing AI-powered applications, built by Weights & Biases.
    ๐Ÿ”— wandb.me/weave

  281. junruxiong/IncarnaMind โญ 789
    Connect and chat with your multiple documents (pdf and txt) through GPT 3.5, GPT-4 Turbo, Claude and Local Open-Source LLMs
    ๐Ÿ”— www.incarnamind.com

  282. nousresearch/Hermes-Function-Calling โญ 788
    Code for the Hermes Pro Large Language Model to perform function calling based on the provided schema. It allows users to query the model and retrieve information related to stock prices, company fundamentals, financial statements

  283. oliveirabruno01/babyagi-asi โญ 788
    BabyAGI: an Autonomous and Self-Improving agent, or BASI

  284. opengvlab/OmniQuant โญ 759
    [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

  285. opengenerativeai/GenossGPT โญ 751
    One API for all LLMs, either private or public (Anthropic, Llama V2, GPT 3.5/4, Vertex, GPT4ALL, HuggingFace ...) 🌈🐂 Replace OpenAI GPT with any LLM in your app with one line.
    ๐Ÿ”— genoss.ai

  286. utkusen/promptmap โญ 726
    Vulnerability scanning tool that automatically tests prompt injection attacks on your LLM applications. It analyzes your LLM system prompts, runs them, and sends attack prompts to them.

  287. salesforce/xgen โญ 717
    Salesforce open-source LLMs with 8k sequence length.

  288. squeezeailab/SqueezeLLM โญ 671
    [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
    ๐Ÿ”— arxiv.org/abs/2306.07629

  289. tag-research/TAG-Bench โญ 661
    Table-Augmented Generation (TAG) is a unified and general-purpose paradigm for answering natural language questions over databases
    ๐Ÿ”— arxiv.org/pdf/2408.14717

  290. mlc-ai/xgrammar โญ 639
    XGrammar is an open-source library for efficient, flexible, and portable structured generation. It supports general context-free grammar to enable a broad range of structures while bringing careful system optimizations to enable fast executions.
    ๐Ÿ”— xgrammar.mlc.ai

  291. lupantech/ScienceQA โญ 627
    Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".

  292. developersdigest/llm-api-engine โญ 622
    Build and deploy AI-powered APIs in seconds. This project allows you to create custom APIs that extract structured data from websites using natural language descriptions, powered by LLMs and web scraping technology.
    ๐Ÿ”— www.youtube.com/watch?v=8kuek1bo4mm

  293. tsinghuadatabasegroup/DB-GPT โญ 597
    LLM As Database Administrator
    ๐Ÿ”— dbgpt.dbmind.cn

  294. microsoft/VPTQ โญ 575
    Extreme Low-bit Vector Post-Training Quantization for Large Language Models

  295. zhudotexe/kani โญ 570
    kani (カニ) is a highly hackable microframework for chat-based language models with tool use/function calling. (NLP-OSS @ EMNLP 2023)
    ๐Ÿ”— kani.readthedocs.io

  296. modal-labs/llm-finetuning โญ 564
    Guide for fine-tuning Llama/Mistral/CodeLlama models and more

  297. magnivorg/prompt-layer-library โญ 544
    🍰 PromptLayer - Maintain a log of your prompts and OpenAI API requests. Track, debug, and replay old completions.
    ๐Ÿ”— www.promptlayer.com

  298. hazyresearch/ama_prompting โญ 544
    Ask Me Anything language model prompting

  299. declare-lab/instruct-eval โญ 540
    This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
    ๐Ÿ”— declare-lab.github.io/instruct-eval

  300. vahe1994/SpQR โญ 538
    Quantization algorithm and the model evaluation code for SpQR method for LLM compression

  301. eugeneyan/obsidian-copilot โญ 523
    🤖 A prototype assistant for writing and thinking
    ๐Ÿ”— eugeneyan.com/writing/obsidian-copilot

  302. continuum-llms/chatgpt-memory โญ 523
    Allows scaling the ChatGPT API to multiple simultaneous sessions with infinite contextual and adaptive memory, powered by GPT and a Redis datastore.

  303. judahpaul16/gpt-home โญ 519
    ChatGPT at home! Basically a better Google Nest Hub or Amazon Alexa home assistant. Built on the Raspberry Pi using the OpenAI API.
    ๐Ÿ”— hub.docker.com/r/judahpaul/gpt-home

  304. hazyresearch/H3 โญ 516
    Language Modeling with the H3 State Space Model

  305. kbressem/medAlpaca โญ 504
    LLM finetuned for medical question answering

  306. huggingface/text-clustering โญ 495
    Easily embed, cluster and semantically label text datasets

  307. stanford-oval/suql โญ 241
    SUQL: Conversational Search over Structured and Unstructured Data with LLMs
    ๐Ÿ”— arxiv.org/abs/2311.09818

  308. dottxt-ai/outlines-core โญ 173
    Core functionality for structured generation, formerly implemented in Outlines, with a focus on performance and portability.

  309. prithivirajdamodaran/Route0x โญ 92
    A production-grade query routing solution, leveraging LLMs while optimizing for cost per query

  310. whitead/paper-qa โญ 3
    High accuracy RAG for answering questions from scientific documents with citations

Math and Science

Mathematical, numerical and scientific libraries.

  1. numpy/numpy โญ 28,718
    The fundamental package for scientific computing with Python.
    ๐Ÿ”— numpy.org

  2. camdavidsonpilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers โญ 27,158
    aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
    ๐Ÿ”— camdavidsonpilon.github.io/probabilistic-programming-and-bayesian-methods-for-hackers

  3. taichi-dev/taichi โญ 26,664
    Productive, portable, and performant GPU programming in Python: Taichi Lang is an open-source, imperative, parallel programming language for high-performance numerical computation.
    ๐Ÿ”— taichi-lang.org

  4. experience-monks/math-as-code โญ 15,277
    This is a reference to ease developers into mathematical notation by showing comparisons with Python code

  5. scipy/scipy โญ 13,347
    SciPy library main repository
    ๐Ÿ”— scipy.org

  6. sympy/sympy โญ 13,272
    A computer algebra system written in pure Python
    ๐Ÿ”— sympy.org

  7. google/or-tools โญ 11,581
    Google Optimization Tools (a.k.a., OR-Tools) is an open-source, fast and portable software suite for solving combinatorial optimization problems.
    ๐Ÿ”— developers.google.com/optimization

  8. z3prover/z3 โญ 10,612
    Z3 is a theorem prover from Microsoft Research with a Python language binding.

  9. cupy/cupy โญ 9,737
    NumPy & SciPy for GPU
    ๐Ÿ”— cupy.dev

  10. google-deepmind/alphageometry โญ 4,259
    Solving Olympiad Geometry without Human Demonstrations

  11. pim-book/programmers-introduction-to-mathematics โญ 3,555
    Code for A Programmer's Introduction to Mathematics
    ๐Ÿ”— pimbook.org

  12. mikedh/trimesh โญ 3,119
    Python library for loading and using triangular meshes.
    ๐Ÿ”— trimesh.org

  13. talalalrawajfeh/mathematics-roadmap โญ 2,787
    A Comprehensive Roadmap to Mathematics

  14. pyro-ppl/numpyro โญ 2,381
    Probabilistic programming with NumPy powered by JAX for autograd and JIT compilation to GPU/TPU/CPU.
    ๐Ÿ”— num.pyro.ai

  15. mckinsey/causalnex โญ 2,274
    A Python library that helps data scientists to infer causation rather than observing correlation.
    ๐Ÿ”— causalnex.readthedocs.io

  16. pyomo/pyomo โญ 2,100
    An object-oriented algebraic modeling language in Python for structured optimization problems.
    ๐Ÿ”— www.pyomo.org

  17. facebookresearch/theseus โญ 1,826
    A library for differentiable nonlinear optimization

  18. arviz-devs/arviz โญ 1,644
    Exploratory analysis of Bayesian models with Python
    ๐Ÿ”— python.arviz.org

  19. google-research/torchsde โญ 1,600
    Differentiable SDE solvers with GPU support and efficient sensitivity analysis.

  20. dynamicslab/pysindy โญ 1,510
    A package for the sparse identification of nonlinear dynamical systems from data
    ๐Ÿ”— pysindy.readthedocs.io/en/latest

  21. geomstats/geomstats โญ 1,289
    Computations and statistics on manifolds with geometric structures.
    ๐Ÿ”— geomstats.ai

  22. cma-es/pycma โญ 1,139
    pycma is a Python implementation of CMA-ES and a few related numerical optimization tools.

  23. pymc-labs/CausalPy โญ 945
    A Python package for causal inference in quasi-experimental settings
    ๐Ÿ”— causalpy.readthedocs.io

  24. sj001/AI-Feynman โญ 661
    Implementation of AI Feynman: a Physics-Inspired Method for Symbolic Regression

  25. willianfuks/tfcausalimpact โญ 627
    Python Causal Impact Implementation Based on Google's R Package. Built using TensorFlow Probability.

  26. lean-dojo/LeanDojo โญ 606
    Tool for data extraction and interacting with Lean programmatically.
    ๐Ÿ”— leandojo.org

  27. brandondube/prysm โญ 280
    Prysm is an open-source library for physical and first-order modeling of optical systems and analysis of related data: numerical and physical optics, integrated modeling, phase retrieval, segmented systems, polynomials and fitting, sequential raytracing.
    ๐Ÿ”— prysm.readthedocs.io/en/stable

  28. lean-dojo/ReProver โญ 246
    Retrieval-Augmented Theorem Provers for Lean
    ๐Ÿ”— leandojo.org

  29. albahnsen/pycircular โญ 104
    pycircular is a Python module for circular data analysis

  30. gbillotey/Fractalshades โญ 28
    Arbitrary-precision fractal explorer - Python package

Machine Learning - General

General and classical machine learning libraries. See below for other sections covering specialised ML areas.

  1. openai/openai-cookbook โญ 61,497
    Examples and guides for using the OpenAI API
    ๐Ÿ”— cookbook.openai.com

  2. scikit-learn/scikit-learn โญ 60,964
    scikit-learn: machine learning in Python
    ๐Ÿ”— scikit-learn.org

  3. suno-ai/bark โญ 36,830
    🔊 Text-Prompted Generative Audio Model

  4. tencentarc/GFPGAN โญ 36,245
    GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

  5. google-research/google-research โญ 34,815
    This repository contains code released by Google Research
    ๐Ÿ”— research.google

  6. facebookresearch/faiss โญ 32,737
    A library for efficient similarity search and clustering of dense vectors.
    ๐Ÿ”— faiss.ai

  7. google/jax โญ 31,147
    Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
    ๐Ÿ”— jax.readthedocs.io

  8. open-mmlab/mmdetection โญ 30,143
    OpenMMLab Detection Toolbox and Benchmark
    ๐Ÿ”— mmdetection.readthedocs.io

  9. lutzroeder/netron โญ 29,254
    Visualizer for neural network, deep learning and machine learning models
    ๐Ÿ”— netron.app

  10. google/mediapipe โญ 28,457
    Cross-platform, customizable ML solutions for live and streaming media.
    ๐Ÿ”— ai.google.dev/edge/mediapipe

  11. ageron/handson-ml2 โญ 28,334
    A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

  12. dmlc/xgboost โญ 26,551
    Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
    ๐Ÿ”— xgboost.readthedocs.io/en/stable

  13. roboflow/supervision โญ 24,769
    We write your reusable computer vision tools. 💜
    ๐Ÿ”— supervision.roboflow.com

  14. harisiqbal88/PlotNeuralNet โญ 22,657
    LaTeX code for making neural network diagrams

  15. jina-ai/serve โญ 21,274
    ☁️ Build multimodal AI applications with cloud-native stack
    ๐Ÿ”— jina.ai/serve

  16. ml-explore/mlx โญ 18,811
    MLX is an array framework for machine learning on Apple silicon, brought to you by Apple machine learning research.
    ๐Ÿ”— ml-explore.github.io/mlx

  17. onnx/onnx โญ 18,347
    Open standard for machine learning interoperability
    ๐Ÿ”— onnx.ai

  18. microsoft/LightGBM โญ 16,932
    A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
    ๐Ÿ”— lightgbm.readthedocs.io/en/latest

  19. ddbourgin/numpy-ml โญ 15,889
    Machine learning, in numpy
    ๐Ÿ”— numpy-ml.readthedocs.io

  20. tensorflow/tensor2tensor โญ 15,781
    Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

  21. microsoft/onnxruntime โญ 15,466
    ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
    ๐Ÿ”— onnxruntime.ai

  22. aleju/imgaug โญ 14,500
    Image augmentation for machine learning experiments.
    ๐Ÿ”— imgaug.readthedocs.io

  23. microsoft/nni โญ 14,109
    An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
    ๐Ÿ”— nni.readthedocs.io

  24. jindongwang/transferlearning โญ 13,679
    Transfer learning / domain adaptation / domain generalization / multi-task learning etc. Papers, codes, datasets, applications, tutorials. 迁移学习 (transfer learning)
    ๐Ÿ”— transferlearning.xyz

  25. neonbjb/tortoise-tts โญ 13,624
    A multi-voice TTS system trained with an emphasis on quality

  26. spotify/annoy โญ 13,463
    Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

  27. deepmind/deepmind-research โญ 13,459
    This repository contains implementations and illustrative code to accompany DeepMind publications

  28. deepmind/alphafold โญ 13,131
    Implementation of the inference pipeline of AlphaFold v2

  29. facebookresearch/AnimatedDrawings โญ 12,208
    Code to accompany "A Method for Animating Children's Drawings of the Human Figure"

  30. ggerganov/ggml โญ 11,724
    Tensor library for machine learning

  31. optuna/optuna โญ 11,303
    A hyperparameter optimization framework
    ๐Ÿ”— optuna.org

  32. google-gemini/cookbook โญ 10,568
    A collection of guides and examples for the Gemini API, including quickstart tutorials for writing prompts.
    ๐Ÿ”— ai.google.dev/gemini-api/docs

  33. thudm/CogVideo โญ 10,499
    text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

  34. statsmodels/statsmodels โญ 10,408
    Statsmodels: statistical modeling and econometrics in Python
    ๐Ÿ”— www.statsmodels.org/devel

  35. twitter/the-algorithm-ml โญ 10,178
    Source code for Twitter's Recommendation Algorithm
    ๐Ÿ”— blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm

  36. cleanlab/cleanlab โญ 10,137
    The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
    ๐Ÿ”— cleanlab.ai

  37. epistasislab/tpot โญ 9,831
    A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
    ๐Ÿ”— epistasislab.github.io/tpot

  38. megvii-basedetection/YOLOX โญ 9,614
    YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/

  39. wandb/wandb โญ 9,429
    The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
    ๐Ÿ”— wandb.ai

  40. pycaret/pycaret โญ 9,105
    An open-source, low-code machine learning library in Python
    ๐Ÿ”— www.pycaret.org

  41. facebookresearch/xformers โญ 8,964
    Hackable and optimized Transformers building blocks, supporting a composable construction.
    ๐Ÿ”— facebookresearch.github.io/xformers

  42. pymc-devs/pymc โญ 8,855
    Bayesian Modeling and Probabilistic Programming in Python
    ๐Ÿ”— docs.pymc.io

  43. uberi/speech_recognition โญ 8,573
    Speech recognition module for Python, supporting several engines and APIs, online and offline.
    ๐Ÿ”— pypi.python.org/pypi/speechrecognition

  44. open-mmlab/mmsegmentation โญ 8,513
    OpenMMLab Semantic Segmentation Toolbox and Benchmark.
    ๐Ÿ”— mmsegmentation.readthedocs.io/en/main

  45. awslabs/autogluon โญ 8,324
    Fast and Accurate ML in 3 Lines of Code
    ๐Ÿ”— auto.gluon.ai

  46. huggingface/accelerate โญ 8,263
    🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
    ๐Ÿ”— huggingface.co/docs/accelerate

  47. catboost/catboost โญ 8,225
    A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
    ๐Ÿ”— catboost.ai

  48. automl/auto-sklearn โญ 7,714
    Automated Machine Learning with scikit-learn
    ๐Ÿ”— automl.github.io/auto-sklearn

  49. lmcinnes/umap โญ 7,611
    Uniform Manifold Approximation and Projection

  50. featurelabs/featuretools โญ 7,354
    An open source python library for automated feature engineering
    ๐Ÿ”— www.featuretools.com

  51. hyperopt/hyperopt โญ 7,332
    Distributed Asynchronous Hyperparameter Optimization in Python
    ๐Ÿ”— hyperopt.github.io/hyperopt

  52. py-why/dowhy โญ 7,255
    DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
    ๐Ÿ”— www.pywhy.org/dowhy

  53. hips/autograd โญ 7,117
    Efficiently computes derivatives of NumPy code.

  54. open-mmlab/mmagic โญ 7,047
    OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.
    ๐Ÿ”— mmagic.readthedocs.io/en/latest

  55. scikit-learn-contrib/imbalanced-learn โญ 6,908
    A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
    ๐Ÿ”— imbalanced-learn.org

  56. ml-explore/mlx-examples โญ 6,815
    Examples in the MLX framework

  57. probml/pyprobml โญ 6,654
    Python code for the "Probabilistic Machine Learning" book by Kevin Murphy

  58. nicolashug/Surprise โญ 6,488
    A Python scikit for building and analyzing recommender systems
    ๐Ÿ”— surpriselib.com

  59. yangchris11/samurai โญ 6,435
    Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
    ๐Ÿ”— yangchris11.github.io/samurai

  60. google/automl โญ 6,304
    Google Brain AutoML

  61. cleverhans-lab/cleverhans โญ 6,236
    An adversarial example library for constructing attacks, building defenses, and benchmarking both

  62. project-monai/MONAI โญ 6,079
    AI Toolkit for Healthcare Imaging
    ๐Ÿ”— monai.io

  63. kevinmusgrave/pytorch-metric-learning โญ 6,078
    The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
    ๐Ÿ”— kevinmusgrave.github.io/pytorch-metric-learning

  64. open-mmlab/mmcv โญ 5,993
    OpenMMLab Computer Vision Foundation
    ๐Ÿ”— mmcv.readthedocs.io/en/latest

  65. google-deepmind/graphcast โญ 5,793
    GraphCast: Learning skillful medium-range global weather forecasting

  66. uber/causalml โญ 5,199
    Uplift modeling and causal inference with machine learning algorithms

  67. online-ml/river ⭐ 5,182
    🌊 Online machine learning in Python
    🔗 riverml.xyz

  68. mdbloice/Augmentor ⭐ 5,092
    Image augmentation library in Python for machine learning.
    🔗 augmentor.readthedocs.io/en/stable

  69. rasbt/mlxtend ⭐ 4,960
    A library of extension and helper modules for Python's data analysis and machine learning libraries.
    🔗 rasbt.github.io/mlxtend

  70. marqo-ai/marqo ⭐ 4,748
    Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
    🔗 www.marqo.ai

  71. skvark/opencv-python ⭐ 4,677
    Automated CI toolchain to produce precompiled opencv-python, opencv-python-headless, opencv-contrib-python and opencv-contrib-python-headless packages.
    🔗 pypi.org/project/opencv-python

  72. apple/coremltools ⭐ 4,549
    Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
    🔗 coremltools.readme.io

  73. sanchit-gandhi/whisper-jax ⭐ 4,530
    JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

  74. nmslib/hnswlib ⭐ 4,517
    Header-only C++/Python library for fast approximate nearest neighbors
    🔗 github.com/nmslib/hnswlib

  75. lucidrains/deep-daze ⭐ 4,368
    Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun

  76. districtdatalabs/yellowbrick ⭐ 4,313
    Visual analysis and diagnostic tools to facilitate machine learning model selection.
    🔗 www.scikit-yb.org

  77. nv-tlabs/GET3D ⭐ 4,286
    Generative Model of High Quality 3D Textured Shapes Learned from Images

  78. huggingface/autotrain-advanced ⭐ 4,228
    AutoTrain Advanced: faster and easier training and deployments of state-of-the-art machine learning models
    🔗 huggingface.co/autotrain

  79. microsoft/FLAML ⭐ 4,035
    A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
    🔗 microsoft.github.io/flaml

  80. cmusphinx/pocketsphinx ⭐ 4,014
    A small speech recognizer

  81. ourownstory/neural_prophet ⭐ 3,975
    NeuralProphet: A simple forecasting package
    🔗 neuralprophet.com

  82. py-why/EconML ⭐ 3,948
    ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to brin...
    🔗 www.microsoft.com/en-us/research/project/alice

  83. huggingface/notebooks ⭐ 3,840
    Notebooks using the Hugging Face libraries 🤗

  84. zjunlp/DeepKE ⭐ 3,711
    [EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
    🔗 deepke.zjukg.cn

  85. huggingface/speech-to-speech ⭐ 3,703
    Speech To Speech: an effort for an open-sourced and modular GPT-4o

  86. rucaibox/RecBole ⭐ 3,567
    A unified, comprehensive and efficient recommendation library
    🔗 recbole.io

  87. yoheinakajima/instagraph ⭐ 3,498
    Converts text input or a URL into a knowledge graph and displays it

  88. pytorch/glow ⭐ 3,267
    Compiler for Neural Network hardware accelerators

  89. lightly-ai/lightly ⭐ 3,266
    A Python library for self-supervised learning on images.
    🔗 docs.lightly.ai/self-supervised-learning

  90. facebookresearch/vissl ⭐ 3,265
    VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
    🔗 vissl.ai

  91. lucidrains/musiclm-pytorch ⭐ 3,221
    Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in PyTorch

  92. hrnet/HRNet-Semantic-Segmentation ⭐ 3,193
    The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919

  93. mljar/mljar-supervised ⭐ 3,106
    Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
    🔗 mljar.com

  94. shankarpandala/lazypredict ⭐ 3,083
    Lazy Predict helps build many basic models with little code and helps you understand which models work better without any parameter tuning

  95. huggingface/safetensors ⭐ 3,060
    Implements a new simple format for storing tensors safely (as opposed to pickle) and that is still fast (zero-copy).
    🔗 huggingface.co/docs/safetensors

  96. scikit-learn-contrib/hdbscan ⭐ 2,845
    A high performance implementation of HDBSCAN clustering.
    🔗 hdbscan.readthedocs.io/en/latest

  97. scikit-optimize/scikit-optimize ⭐ 2,754
    Sequential model-based optimization with a scipy.optimize interface
    🔗 scikit-optimize.github.io

  98. google-research/t5x ⭐ 2,733
    T5X is a modular, composable, research-friendly framework for high-performance, configurable, self-service training, evaluation, and inference of sequence models (starting with language) at many scales.

  99. huggingface/optimum ⭐ 2,719
    🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
    🔗 huggingface.co/docs/optimum/main

  100. apple/ml-ane-transformers ⭐ 2,587
    Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)

  101. freedmand/semantra ⭐ 2,558
    Semantra is a multipurpose tool for semantically searching documents. Query by meaning rather than just by matching text.

  102. rom1504/clip-retrieval ⭐ 2,479
    Easily compute CLIP embeddings and build a CLIP retrieval system with them
    🔗 rom1504.github.io/clip-retrieval

  103. scikit-learn-contrib/category_encoders ⭐ 2,422
    A library of sklearn-compatible categorical variable encoders
    🔗 contrib.scikit-learn.org/category_encoders

  104. neuraloperator/neuraloperator ⭐ 2,398
    Comprehensive library for learning neural operators in PyTorch. It is the official implementation for Fourier Neural Operators and Tensorized Neural Operators.
    🔗 neuraloperator.github.io/dev/index.html

  105. priorlabs/TabPFN ⭐ 2,396
    The TabPFN is a neural network that learned to do tabular data prediction. This is the original CUDA-supporting PyTorch implementation.
    🔗 priorlabs.ai

  106. eric-mitchell/direct-preference-optimization ⭐ 2,347
    Reference implementation for DPO (Direct Preference Optimization)

  107. huggingface/huggingface_hub ⭐ 2,297
    The official Python client for the Hugging Face Hub.
    🔗 huggingface.co/docs/huggingface_hub

  108. aws/sagemaker-python-sdk ⭐ 2,126
    A library for training and deploying machine learning models on Amazon SageMaker
    🔗 sagemaker.readthedocs.io

  109. huggingface/evaluate ⭐ 2,095
    🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
    🔗 huggingface.co/docs/evaluate

  110. contextlab/hypertools ⭐ 1,831
    A Python toolbox for gaining geometric insights into high-dimensional data
    🔗 hypertools.readthedocs.io/en/latest

  111. linkedin/greykite ⭐ 1,824
    A flexible, intuitive and fast forecasting library

  112. rentruewang/koila ⭐ 1,822
    Prevent PyTorch's CUDA error: out of memory in just 1 line of code.
    🔗 koila.rentruewang.com

  113. bmabey/pyLDAvis ⭐ 1,812
    Python library for interactive topic model visualization. Port of the R LDAvis package.

  114. microsoft/Olive ⭐ 1,734
    Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
    🔗 microsoft.github.io/olive

  115. scikit-learn-contrib/lightning ⭐ 1,732
    Large-scale linear classification, regression and ranking in Python
    🔗 contrib.scikit-learn.org/lightning

  116. qdrant/fastembed ⭐ 1,729
    Fast, accurate, lightweight Python library for generating state-of-the-art embeddings
    🔗 qdrant.github.io/fastembed

  117. castorini/pyserini ⭐ 1,729
    Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
    🔗 pyserini.io

  118. tensorflow/addons ⭐ 1,694
    Useful extra functionality for TensorFlow 2.x maintained by SIG-addons

  119. microsoft/i-Code ⭐ 1,683
    The ambition of the i-Code project is to build integrative and composable multimodal AI. The "i" stands for integrative multimodal learning.

  120. stanfordmlgroup/ngboost ⭐ 1,677
    Natural Gradient Boosting for Probabilistic Prediction

  121. visual-layer/fastdup ⭐ 1,651
    fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.

  122. laekov/fastmoe ⭐ 1,610
    A fast MoE implementation for PyTorch
    🔗 fastmoe.ai

  123. kubeflow/katib ⭐ 1,536
    Automated Machine Learning on Kubernetes
    🔗 www.kubeflow.org/docs/components/katib

  124. google/vizier ⭐ 1,531
    Python-based research interface for blackbox and hyperparameter optimization, based on the internal Google Vizier Service.
    🔗 oss-vizier.readthedocs.io

  125. jina-ai/finetuner ⭐ 1,486
    🎯 Task-oriented embedding tuning for BERT, CLIP, etc.
    🔗 finetuner.jina.ai

  126. csinva/imodels ⭐ 1,421
    Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
    🔗 csinva.io/imodels

  127. microsoft/Semi-supervised-learning ⭐ 1,408
    A Unified Semi-Supervised Learning Codebase (NeurIPS'22)
    🔗 usb.readthedocs.io

  128. patchy631/machine-learning ⭐ 1,401
    Machine Learning Tutorials Repository

  129. spotify/voyager ⭐ 1,383
    🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
    🔗 spotify.github.io/voyager

  130. borealisai/advertorch ⭐ 1,319
    A Toolbox for Adversarial Robustness Research

  131. koaning/scikit-lego ⭐ 1,299
    Extra blocks for scikit-learn pipelines.
    🔗 koaning.github.io/scikit-lego

  132. awslabs/dgl-ke ⭐ 1,283
    High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
    🔗 dglke.dgl.ai/doc

  133. lightning-ai/lightning-thunder ⭐ 1,276
    Thunder is a source-to-source compiler for PyTorch. It makes PyTorch programs faster by combining and using different hardware executors at once

  134. pytorch/FBGEMM ⭐ 1,250
    FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

  135. nvidia/cuda-python ⭐ 1,081
    CUDA Python: Performance meets Productivity
    🔗 nvidia.github.io/cuda-python

  136. davidmrau/mixture-of-experts ⭐ 1,031
    PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

  137. google-research/deeplab2 ⭐ 1,011
    DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks.

  138. opentensor/bittensor ⭐ 997
    Internet-scale Neural Networks
    🔗 www.bittensor.com

  139. oml-team/open-metric-learning ⭐ 902
    OML is a PyTorch-based framework to train and validate the models producing high-quality embeddings.
    🔗 open-metric-learning.readthedocs.io/en/latest/index.html

  140. hazyresearch/safari ⭐ 876
    Convolutions for Sequence Modeling

  141. huggingface/optimum-quanto ⭐ 871
    A PyTorch quantization backend for optimum

  142. criteo/autofaiss ⭐ 832
    Automatically create Faiss knn indices with the most optimal similarity search parameters.
    🔗 criteo.github.io/autofaiss

  143. replicate/replicate-python ⭐ 792
    Python client for Replicate
    🔗 replicate.com

  144. pymc-labs/pymc-marketing ⭐ 770
    Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.
    🔗 www.pymc-marketing.io

  145. awslabs/python-deequ ⭐ 739
    Python API for Deequ, a library built on Spark for defining "unit tests for data", which measure data quality in large datasets

  146. facebookresearch/balance ⭐ 691
    The balance Python package offers a simple workflow and methods for dealing with biased data samples when looking to infer from them to some target population of interest.
    🔗 import-balance.org

  147. googleapis/python-aiplatform ⭐ 686
    A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.

  148. nicolas-hbt/pygraft ⭐ 680
    Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
    🔗 pygraft.readthedocs.io/en/latest

  149. qdrant/quaterion ⭐ 647
    Blazing fast framework for fine-tuning similarity learning models
    🔗 quaterion.qdrant.tech

  150. huggingface/exporters ⭐ 640
    Export Hugging Face models to Core ML and TensorFlow Lite

  151. hpcaitech/EnergonAI ⭐ 628
    Large-scale model inference.

  152. intel/intel-npu-acceleration-library ⭐ 609
    The Intel NPU Acceleration Library is a Python library designed to boost the efficiency of your applications by leveraging the power of the Intel Neural Processing Unit (NPU) to perform high-speed computations on compatible hardware.

  153. nomic-ai/contrastors ⭐ 577
    Contrastive learning toolkit that enables researchers and engineers to train and evaluate contrastive models efficiently.

  154. intellabs/bayesian-torch ⭐ 565
    A library for Bayesian neural network layers and uncertainty estimation in Deep Learning extending the core of PyTorch

  155. microsoft/Focal-Transformer ⭐ 549
    [NeurIPS 2021 Spotlight] Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"

  156. linkedin/FastTreeSHAP ⭐ 532
    Fast SHAP value computation for interpreting tree-based models

  157. mrdbourke/m1-machine-learning-test ⭐ 531
    Code for testing various M1 Chip benchmarks with TensorFlow.

  158. nevronai/MetisFL ⭐ 526
    The first open Federated Learning framework implemented in C++ and Python.
    🔗 metisfl.org

  159. deepgraphlearning/ULTRA ⭐ 505
    A foundation model for knowledge graph reasoning

  160. dylanhogg/gptauthor ⭐ 69
    GPTAuthor is an AI tool for writing long form, multi-chapter stories given a story prompt.

Machine Learning - Deep Learning

Machine learning libraries that cross over with deep learning in some way.

  1. tensorflow/tensorflow ⭐ 187,738
    An Open Source Machine Learning Framework for Everyone
    🔗 tensorflow.org

  2. pytorch/pytorch ⭐ 86,488
    Tensors and Dynamic neural networks in Python with strong GPU acceleration
    🔗 pytorch.org

  3. openai/whisper ⭐ 75,567
    Robust Speech Recognition via Large-Scale Weak Supervision

  4. keras-team/keras ⭐ 62,452
    Deep Learning for humans
    🔗 keras.io

  5. deepfakes/faceswap ⭐ 53,156
    Deepfakes Software For All
    🔗 www.faceswap.dev

  6. facebookresearch/segment-anything ⭐ 48,687
    The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

  7. microsoft/DeepSpeed ⭐ 36,513
    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
    🔗 www.deepspeed.ai

  8. rwightman/pytorch-image-models ⭐ 33,053
    The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
    🔗 huggingface.co/docs/timm

  9. facebookresearch/detectron2 ⭐ 31,142
    Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
    🔗 detectron2.readthedocs.io/en/latest

  10. xinntao/Real-ESRGAN ⭐ 29,449
    Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

  11. lightning-ai/pytorch-lightning ⭐ 28,899
    The deep learning framework to pretrain, finetune and deploy AI models. PyTorch Lightning is just organized PyTorch - Lightning disentangles PyTorch code to decouple the science from the engineering.
    🔗 lightning.ai

  12. google-research/tuning_playbook ⭐ 27,934
    A playbook for systematically maximizing the performance of deep learning models.

  13. openai/CLIP ⭐ 27,193
    CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

  14. facebookresearch/Detectron ⭐ 26,300
    FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

  15. matterport/Mask_RCNN ⭐ 24,875
    Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

  16. paddlepaddle/Paddle ⭐ 22,439
    PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
    🔗 www.paddlepaddle.org

  17. pyg-team/pytorch_geometric ⭐ 21,828
    Graph Neural Network Library for PyTorch
    🔗 pyg.org

  18. lucidrains/vit-pytorch ⭐ 21,621
    Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

  19. apache/mxnet ⭐ 20,787
    Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
    🔗 mxnet.apache.org

  20. sanster/IOPaint ⭐ 20,309
    Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
    🔗 www.iopaint.com

  21. danielgatis/rembg ⭐ 17,862
    Rembg is a tool to remove image backgrounds

  22. rasbt/deeplearning-models ⭐ 16,866
    A collection of various deep learning architectures, models, and tips

  23. albumentations-team/albumentations ⭐ 14,541
    Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
    🔗 albumentations.ai

  24. microsoft/Swin-Transformer ⭐ 14,253
    This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
    🔗 arxiv.org/abs/2103.14030

  25. facebookresearch/detr ⭐ 13,938
    End-to-End Object Detection with Transformers

  26. nvidia/DeepLearningExamples ⭐ 13,895
    State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

  27. dmlc/dgl ⭐ 13,691
    Python package built to ease deep learning on graphs, on top of existing DL frameworks.
    🔗 dgl.ai

  28. mlfoundations/open_clip ⭐ 10,906
    Open source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training).

  29. kornia/kornia ⭐ 10,186
    🐍 Geometric Computer Vision Library for Spatial AI
    🔗 kornia.readthedocs.io

  30. modelscope/facechain ⭐ 9,252
    FaceChain is a deep-learning toolchain for generating your Digital-Twin.

  31. keras-team/autokeras ⭐ 9,194
    AutoML library for deep learning
    🔗 autokeras.com

  32. facebookresearch/pytorch3d ⭐ 9,016
    PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
    🔗 pytorch3d.org

  33. arogozhnikov/einops ⭐ 8,694
    Flexible and powerful tensor operations for readable and reliable code (for PyTorch, JAX, TF and others)
    🔗 einops.rocks

  34. pyro-ppl/pyro ⭐ 8,644
    Deep universal probabilistic programming with Python and PyTorch
    🔗 pyro.ai

  35. nvidia/apex ⭐ 8,523
    A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch

  36. bytedance/monolith ⭐ 8,515
    A deep learning framework for large scale recommendation modeling with collisionless embedding and real time training captures.

  37. facebookresearch/ImageBind ⭐ 8,493
    ImageBind: One Embedding Space to Bind Them All

  38. lucidrains/imagen-pytorch ⭐ 8,164
    Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch

  39. google/trax ⭐ 8,155
    Trax: Deep Learning with Clear Code and Speed

  40. tencent/HunyuanVideo ⭐ 8,066
    HunyuanVideo: A Systematic Framework For Large Video Generation Model
    🔗 aivideo.hunyuan.tencent.com

  41. xpixelgroup/BasicSR ⭐ 7,131
    Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also supports StyleGAN2 and DFDNet.
    🔗 basicsr.readthedocs.io/en/latest

  42. google/flax ⭐ 6,304
    Flax is a neural network library for JAX that is designed for flexibility.
    🔗 flax.readthedocs.io

  43. skorch-dev/skorch ⭐ 5,956
    A scikit-learn compatible neural network library that wraps PyTorch

  44. facebookresearch/mmf ⭐ 5,525
    A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
    🔗 mmf.sh

  45. mosaicml/composer ⭐ 5,256
    Supercharge Your Model Training
    🔗 docs.mosaicml.com

  46. deci-ai/super-gradients ⭐ 4,667
    Easily train or fine-tune SOTA computer vision models with one open source training library. The home of YOLO-NAS.
    🔗 www.supergradients.com

  47. nvidiagameworks/kaolin ⭐ 4,603
    A PyTorch Library for Accelerating 3D Deep Learning Research

  48. facebookincubator/AITemplate ⭐ 4,591
    AITemplate is a Python framework which renders neural networks into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

  49. pytorch/ignite ⭐ 4,576
    High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
    🔗 pytorch-ignite.ai

  50. cvg/LightGlue ⭐ 3,575
    LightGlue: Local Feature Matching at Light Speed (ICCV 2023)

  51. williamyang1991/VToonify ⭐ 3,559
    [SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer

  52. google-research/scenic ⭐ 3,408
    Scenic: A JAX Library for Computer Vision Research and Beyond

  53. facebookresearch/PyTorch-BigGraph ⭐ 3,395
    Generate embeddings from large-scale graph-structured data.
    🔗 torchbiggraph.readthedocs.io

  54. pytorch/botorch ⭐ 3,166
    Bayesian optimization in PyTorch
    🔗 botorch.org

  55. alpa-projects/alpa ⭐ 3,096
    Training and serving large-scale neural networks with auto parallelization.
    🔗 alpa.ai

  56. deepmind/dm-haiku ⭐ 2,950
    JAX-based neural network library
    🔗 dm-haiku.readthedocs.io

  57. explosion/thinc ⭐ 2,829
    🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
    🔗 thinc.ai

  58. nerdyrodent/VQGAN-CLIP ⭐ 2,632
    Just playing with getting VQGAN+CLIP running locally, rather than having to use Colab.

  59. danielegrattarola/spektral ⭐ 2,375
    Graph Neural Networks with Keras and TensorFlow 2.
    🔗 graphneural.network

  60. google-research/electra ⭐ 2,345
    ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

  61. modelscope/ClearerVoice-Studio ⭐ 2,150
    An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

  62. fepegar/torchio ⭐ 2,121
    Medical imaging processing for deep learning.
    🔗 torchio.org

  63. neuralmagic/sparseml ⭐ 2,098
    Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

  64. pytorch/torchrec ⭐ 2,027
    PyTorch domain library for recommendation systems
    🔗 pytorch.org/torchrec

  65. tensorflow/mesh ⭐ 1,600
    Mesh TensorFlow: Model Parallelism Made Easier

  66. tensorly/tensorly ⭐ 1,592
    TensorLy: Tensor Learning in Python.
    🔗 tensorly.org

  67. vt-vl-lab/FGVC ⭐ 1,557
    [ECCV 2020] Flow-edge Guided Video Completion

  68. calculatedcontent/WeightWatcher ⭐ 1,533
    The WeightWatcher tool for predicting the accuracy of Deep Neural Networks

  69. jeshraghian/snntorch ⭐ 1,431
    Deep and online learning with spiking neural networks in Python
    🔗 snntorch.readthedocs.io/en/latest

  70. hysts/pytorch_image_classification ⭐ 1,379
    PyTorch implementation of image classification models for CIFAR-10/CIFAR-100/MNIST/FashionMNIST/Kuzushiji-MNIST/ImageNet

  71. xl0/lovely-tensors ⭐ 1,180
    Tensors, for human consumption
    🔗 xl0.github.io/lovely-tensors

  72. deepmind/android_env ⭐ 1,043
    RL research on Android devices.

  73. keras-team/keras-cv ⭐ 1,018
    Industry-strength Computer Vision workflows with Keras

  74. tensorflow/similarity ⭐ 1,016
    TensorFlow Similarity is a Python package focused on making similarity learning quick and easy.

  75. kakaobrain/rq-vae-transformer ⭐ 828
    The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)

  76. deepmind/chex ⭐ 814
    Chex is a library of utilities for helping to write reliable JAX code
    🔗 chex.readthedocs.io

  77. mlfoundations/datacomp ⭐ 676
    DataComp: In search of the next generation of multimodal datasets
    🔗 datacomp.ai

  78. whitead/dmol-book ⭐ 635
    Deep learning for molecules and materials book
    🔗 dmol.pub

  79. allenai/reward-bench ⭐ 498
    RewardBench is a benchmark designed to evaluate the capabilities and safety of reward models (including those trained with Direct Preference Optimization, DPO)
    🔗 huggingface.co/spaces/allenai/reward-bench

Machine Learning - Interpretability

Machine learning interpretability libraries. Covers explainability, prediction explanations, dashboards, understanding knowledge development in training.

  1. slundberg/shap ⭐ 23,330
    A game theoretic approach to explain the output of any machine learning model.
    🔗 shap.readthedocs.io

  2. marcotcr/lime ⭐ 11,733
    Lime: Explaining the predictions of any machine learning classifier

  3. interpretml/interpret ⭐ 6,378
    Fit interpretable models. Explain blackbox machine learning.
    🔗 interpret.ml/docs

  4. pytorch/captum ⭐ 5,062
    Model interpretability and understanding for PyTorch
    🔗 captum.ai

  5. tensorflow/lucid ⭐ 4,685
    A collection of infrastructure and tools for research in neural network interpretability.

  6. arize-ai/phoenix ⭐ 4,627
    AI Observability & Evaluation
    🔗 docs.arize.com/phoenix

  7. pair-code/lit ⭐ 3,520
    The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
    🔗 pair-code.github.io/lit

  8. maif/shapash ⭐ 2,770
    🔅 Shapash: User-friendly Explainability and Interpretability to Develop Reliable and Transparent Machine Learning Models
    🔗 maif.github.io/shapash

  9. teamhg-memex/eli5 ⭐ 2,763
    A library for debugging/inspecting machine learning classifiers and explaining their predictions
    🔗 eli5.readthedocs.io

  10. seldonio/alibi ⭐ 2,438
    Algorithms for explaining machine learning models
    🔗 docs.seldon.io/projects/alibi/en/stable

  11. eleutherai/pythia ⭐ 2,355
    Interpretability analysis and scaling laws to understand how knowledge develops and evolves during training in autoregressive transformers

  12. oegedijk/explainerdashboard ⭐ 2,349
    Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.
    🔗 explainerdashboard.readthedocs.io

  13. jalammar/ecco ⭐ 2,005
    Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTa, T5, and T0).
    🔗 ecco.readthedocs.io

  14. transformerlensorg/TransformerLens ⭐ 1,819
    A library for mechanistic interpretability of GPT-style language models
    🔗 transformerlensorg.github.io/transformerlens

  15. google-deepmind/penzai ⭐ 1,725
    A JAX library for writing models as legible, functional pytree data structures, along with tools for visualizing, modifying, and analyzing them. Penzai focuses on making it easy to do stuff with models after they have been trained
    🔗 penzai.readthedocs.io

  16. trusted-ai/AIX360 ⭐ 1,654
    Interpretability and explainability of data and machine learning models
    🔗 aix360.res.ibm.com

  17. cdpierse/transformers-interpret ⭐ 1,318
    Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.

  18. selfexplainml/PiML-Toolbox ⭐ 1,234
    PiML (Python Interpretable Machine Learning) toolbox for model development & diagnostics
    🔗 selfexplainml.github.io/piml-toolbox

  19. ethicalml/xai ⭐ 1,154
    XAI is a Machine Learning library that is designed with AI explainability at its core. XAI contains various tools that enable analysis and evaluation of data and models
    🔗 ethical.institute/principles.html#commitment-3

  20. salesforce/OmniXAI ⭐ 896
    OmniXAI: A Library for eXplainable AI

  21. andyzoujm/representation-engineering ⭐ 782
    Representation Engineering: A Top-Down Approach to AI Transparency
    🔗 www.ai-transparency.org

  22. jbloomaus/SAELens ⭐ 605
    Training Sparse Autoencoders on LLMs. Analyse sparse autoencoders and neural network internals.
    🔗 jbloomaus.github.io/saelens

Machine Learning - Ops

MLOps tools, frameworks and libraries: intersection of machine learning, data engineering and DevOps; deployment, health, diagnostics and governance of ML models.

  1. apache/airflow โญ 38,571
    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
    ๐Ÿ”— airflow.apache.org

  2. ray-project/ray โญ 35,169
    Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
    ๐Ÿ”— ray.io

  3. mlflow/mlflow โญ 19,391
    Open source platform for the machine learning lifecycle
    ๐Ÿ”— mlflow.org

  4. prefecthq/prefect โญ 18,189
    Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
    ๐Ÿ”— prefect.io

  5. spotify/luigi โญ 18,062
    Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.

  6. kestra-io/kestra โญ 15,729
    ⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
    ๐Ÿ”— kestra.io

  7. horovod/horovod โญ 14,360
    Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
    ๐Ÿ”— horovod.ai

  8. iterative/dvc โญ 14,134
    🦉 Data Versioning and ML Experiments
    ๐Ÿ”— dvc.org

  9. dagster-io/dagster โญ 12,440
    An orchestration platform for the development, production, and observation of data assets.
    ๐Ÿ”— dagster.io

  10. ludwig-ai/ludwig โญ 11,300
    Low-code framework for building custom LLMs, neural networks, and other AI models
    ๐Ÿ”— ludwig.ai

  11. bentoml/OpenLLM โญ 10,503
    Run any open-source LLM, such as Llama or Mistral, as an OpenAI-compatible API endpoint in the cloud.
    ๐Ÿ”— bentoml.com

  12. dbt-labs/dbt-core โญ 10,313
    dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
    ๐Ÿ”— getdbt.com

  13. great-expectations/great_expectations โญ 10,163
    Always know what to expect from your data.
    ๐Ÿ”— docs.greatexpectations.io

  14. kedro-org/kedro โญ 10,141
    Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
    ๐Ÿ”— kedro.org

  15. huggingface/text-generation-inference โญ 9,694
    A Rust, Python and gRPC server for text generation inference. Used in production at HuggingFace to power Hugging Chat, the Inference API and Inference Endpoint.
    ๐Ÿ”— hf.co/docs/text-generation-inference

  16. netflix/metaflow โญ 8,505
    Open Source AI/ML Platform
    ๐Ÿ”— metaflow.org

  17. activeloopai/deeplake โญ 8,342
    Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
    ๐Ÿ”— activeloop.ai

  18. langfuse/langfuse โญ 8,258
    🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
    ๐Ÿ”— langfuse.com/docs

  19. mage-ai/mage-ai โญ 8,120
    🧙 Build, run, and manage data pipelines for integrating and transforming data.
    ๐Ÿ”— www.mage.ai

  20. bentoml/BentoML โญ 7,322
    The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
    ๐Ÿ”— bentoml.com

  21. flyteorg/flyte โญ 5,974
    Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
    ๐Ÿ”— flyte.org

  22. allegroai/clearml โญ 5,799
    ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
    ๐Ÿ”— clear.ml/docs

  23. feast-dev/feast โญ 5,775
    The Open Source Feature Store for Machine Learning
    ๐Ÿ”— feast.dev

  24. evidentlyai/evidently โญ 5,666
    Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
    ๐Ÿ”— discord.gg/xzjkranp8b

  25. internlm/lmdeploy โญ 5,417
    LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
    ๐Ÿ”— lmdeploy.readthedocs.io/en/latest

  26. adap/flower โญ 5,394
    Flower: A Friendly Federated AI Framework
    ๐Ÿ”— flower.ai

  27. aimhubio/aim โญ 5,337
    Aim 💫 - An easy-to-use & supercharged open-source experiment tracker.
    ๐Ÿ”— aimstack.io

  28. zenml-io/zenml โญ 4,377
    ZenML 🙏: The bridge between ML and Ops. https://zenml.io.
    ๐Ÿ”— zenml.io

  29. internlm/xtuner โญ 4,188
    An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
    ๐Ÿ”— xtuner.readthedocs.io/zh-cn/latest

  30. orchest/orchest โญ 4,105
    Build data pipelines, the easy way 🛠️
    ๐Ÿ”— orchest.readthedocs.io/en/stable

  31. kubeflow/pipelines โญ 3,665
    Machine Learning Pipelines for Kubeflow
    ๐Ÿ”— www.kubeflow.org/docs/components/pipelines

  32. polyaxon/polyaxon โญ 3,600
    MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle
    ๐Ÿ”— polyaxon.com

  33. ploomber/ploomber โญ 3,537
    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
    ๐Ÿ”— docs.ploomber.io

  34. towhee-io/towhee โญ 3,294
    Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
    ๐Ÿ”— towhee.io

  35. determined-ai/determined โญ 3,093
    Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
    ๐Ÿ”— determined.ai

  36. leptonai/leptonai โญ 2,681
    A Pythonic framework to simplify AI service building
    ๐Ÿ”— lepton.ai

  37. azure/PyRIT โญ 2,161
    The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and ML engineers to red team foundation models and their applications.
    ๐Ÿ”— azure.github.io/pyrit

  38. dagworks-inc/hamilton โญ 2,010
    Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage/tracing and metadata. Runs and scales everywhere python does.
    ๐Ÿ”— hamilton.dagworks.io/en/latest

  39. meltano/meltano โญ 1,934
    Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
    ๐Ÿ”— meltano.com

  40. dstackai/dstack โญ 1,663
    dstack is a lightweight, open-source alternative to Kubernetes & Slurm, simplifying AI container orchestration with multi-cloud & on-prem support. It natively supports NVIDIA, AMD, TPU, and Intel accelerators.
    ๐Ÿ”— dstack.ai/docs

  41. hi-primus/optimus โญ 1,492
    🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
    ๐Ÿ”— hi-optimus.com

  42. dagworks-inc/burr โญ 1,486
    Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and execute on your own infrastructure.
    ๐Ÿ”— burr.dagworks.io

  43. kubeflow/examples โญ 1,425
    A repository to host extended examples and tutorials

Machine Learning - Reinforcement

Machine learning libraries and toolkits that cross over with reinforcement learning in some way: agent reinforcement learning, agent environments, RLHF

  1. openai/gym โญ 35,275
    A toolkit for developing and comparing reinforcement learning algorithms.
    ๐Ÿ”— www.gymlibrary.dev

  2. openai/baselines โญ 15,988
    OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

  3. google/dopamine โญ 10,630
    Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
    ๐Ÿ”— github.com/google/dopamine

  4. thu-ml/tianshou โญ 8,155
    An elegant PyTorch deep reinforcement learning library.
    ๐Ÿ”— tianshou.org

  5. farama-foundation/Gymnasium โญ 8,067
    An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
    ๐Ÿ”— gymnasium.farama.org

  6. deepmind/pysc2 โญ 8,063
    StarCraft II Learning Environment

  7. lucidrains/PaLM-rlhf-pytorch โญ 7,746
    Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

  8. tensorlayer/TensorLayer โญ 7,342
    Deep Learning and Reinforcement Learning Library for Scientists and Engineers
    ๐Ÿ”— tensorlayerx.com

  9. keras-rl/keras-rl โญ 5,538
    Deep Reinforcement Learning for Keras.
    ๐Ÿ”— keras-rl.readthedocs.io

  10. deepmind/dm_control โญ 3,907
    Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

  11. ai4finance-foundation/ElegantRL โญ 3,835
    Massively Parallel Deep Reinforcement Learning. 🔥
    ๐Ÿ”— ai4finance.org

  12. facebookresearch/ReAgent โญ 3,585
    A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)
    ๐Ÿ”— reagent.ai

  13. deepmind/acme โญ 3,573
    A library of reinforcement learning components and agents

  14. opendilab/DI-engine โญ 3,209
    DI-engine is a generalized decision intelligence engine for PyTorch and JAX. It provides python-first and asynchronous-native task and middleware abstractions
    ๐Ÿ”— di-engine-docs.readthedocs.io

  15. eureka-research/Eureka โญ 2,885
    Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
    ๐Ÿ”— eureka-research.github.io

  16. pettingzoo-team/PettingZoo โญ 2,748
    An API standard for multi-agent reinforcement learning environments, with popular reference environments and related utilities
    ๐Ÿ”— pettingzoo.farama.org

  17. pytorch/rl โญ 2,509
    A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
    ๐Ÿ”— pytorch.org/rl

  18. kzl/decision-transformer โญ 2,467
    Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

  19. anthropics/hh-rlhf โญ 1,675
    Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
    ๐Ÿ”— arxiv.org/abs/2204.05862

  20. arise-initiative/robosuite โญ 1,454
    robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
    ๐Ÿ”— robosuite.ai

  21. humancompatibleai/imitation โญ 1,392
    Clean PyTorch implementations of imitation and reward learning algorithms
    ๐Ÿ”— imitation.readthedocs.io

  22. denys88/rl_games โญ 1,000
    RL Games: High performance RL library

  23. google-deepmind/meltingpot โญ 645
    A suite of test scenarios for multi-agent reinforcement learning.

Natural Language Processing

Natural language processing libraries and toolkits: text processing, topic modelling, tokenisers, chatbots. Also see the LLMs and ChatGPT category for crossover.

  1. huggingface/transformers โญ 138,541
    🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
    ๐Ÿ”— huggingface.co/transformers

  2. pytorch/fairseq โญ 30,880
    Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

  3. explosion/spaCy โญ 30,803
    💫 Industrial-strength Natural Language Processing (NLP) in Python
    ๐Ÿ”— spacy.io

  4. myshell-ai/OpenVoice โญ 30,706
    Instant voice cloning by MIT and MyShell. Audio foundation model.
    ๐Ÿ”— research.myshell.ai/open-voice

  5. microsoft/unilm โญ 20,669
    Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
    ๐Ÿ”— aka.ms/generalai

  6. vikparuchuri/marker โญ 20,263
    Marker converts PDF, EPUB, and MOBI to markdown. It's 10x faster than nougat, more accurate on most documents, and has low hallucination risk.
    ๐Ÿ”— www.datalab.to

  7. huggingface/datasets โญ 19,544
    🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
    ๐Ÿ”— huggingface.co/docs/datasets

  8. vikparuchuri/surya โญ 16,037
    OCR, layout analysis, reading order, table recognition in 90+ languages
    ๐Ÿ”— www.datalab.to

  9. ukplab/sentence-transformers โญ 15,895
    State-of-the-Art Text Embeddings
    ๐Ÿ”— www.sbert.net

  10. rare-technologies/gensim โญ 15,821
    Topic Modelling for Humans
    ๐Ÿ”— radimrehurek.com/gensim

  11. gunthercox/ChatterBot โญ 14,178
    ChatterBot is a machine learning, conversational dialog engine for creating chat bots
    ๐Ÿ”— chatterbot.readthedocs.io

  12. flairnlp/flair โญ 14,042
    A very simple framework for state-of-the-art Natural Language Processing (NLP)
    ๐Ÿ”— flairnlp.github.io/flair

  13. nltk/nltk โญ 13,817
    NLTK Source
    ๐Ÿ”— www.nltk.org

  14. m-bain/whisperX โญ 13,665
    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

  15. openai/tiktoken โญ 13,231
    tiktoken is a fast BPE tokeniser for use with OpenAI's models.

  16. nvidia/NeMo โญ 13,015
    A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
    ๐Ÿ”— docs.nvidia.com/nemo-framework/user-guide/latest/overview.html

  17. jina-ai/clip-as-service โญ 12,550
    🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
    ๐Ÿ”— clip-as-service.jina.ai

  18. allenai/allennlp โญ 11,791
    An open-source NLP research library, built on PyTorch.
    ๐Ÿ”— www.allennlp.org

  19. facebookresearch/seamless_communication โญ 11,268
    Foundational Models for State-of-the-Art Speech and Text Translation

  20. google/sentencepiece โญ 10,544
    Unsupervised text tokenizer for Neural Network-based text generation.

  21. facebookresearch/ParlAI โญ 10,497
    A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
    ๐Ÿ”— parl.ai

  22. neuml/txtai โญ 10,238
    💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
    ๐Ÿ”— neuml.github.io/txtai

  23. doccano/doccano โญ 9,737
    Open source annotation tool for machine learning practitioners.

  24. speechbrain/speechbrain โญ 9,302
    A PyTorch-based Speech Toolkit
    ๐Ÿ”— speechbrain.github.io

  25. sloria/TextBlob โญ 9,235
    Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
    ๐Ÿ”— textblob.readthedocs.io

  26. facebookresearch/nougat โญ 9,190
    Implementation of Nougat Neural Optical Understanding for Academic Documents
    ๐Ÿ”— facebookresearch.github.io/nougat

  27. togethercomputer/OpenChatKit โญ 9,024
    OpenChatKit provides a powerful, open-source base to create both specialized and general purpose chatbots

  28. clips/pattern โญ 8,763
    Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
    ๐Ÿ”— github.com/clips/pattern/wiki

  29. espnet/espnet โญ 8,741
    End-to-End Speech Processing Toolkit
    ๐Ÿ”— espnet.github.io/espnet

  30. deeppavlov/DeepPavlov โญ 6,777
    An open source library for deep learning end-to-end dialog systems and chatbots.
    ๐Ÿ”— deeppavlov.ai

  31. facebookresearch/metaseq โญ 6,519
    A codebase for working with Open Pre-trained Transformers, originally forked from fairseq.

  32. maartengr/BERTopic โญ 6,368
    Leveraging BERT and c-TF-IDF to create easily interpretable topics.
    ๐Ÿ”— maartengr.github.io/bertopic

  33. kingoflolz/mesh-transformer-jax โญ 6,322
    Model parallel transformers in JAX and Haiku

  34. aiwaves-cn/agents โญ 5,434
    An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents

  35. quivrhq/MegaParse โญ 5,226
    File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
    ๐Ÿ”— megaparse.com

  36. layout-parser/layout-parser โญ 5,056
    A Unified Toolkit for Deep Learning Based Document Image Analysis
    ๐Ÿ”— layout-parser.github.io

  37. salesforce/CodeGen โญ 4,994
    CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

  38. minimaxir/textgenrnn โญ 4,936
    Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

  39. makcedward/nlpaug โญ 4,502
    Data augmentation for NLP
    ๐Ÿ”— makcedward.github.io

  40. facebookresearch/DrQA โญ 4,481
    Reading Wikipedia to Answer Open-Domain Questions

  41. argilla-io/argilla โญ 4,283
    Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
    ๐Ÿ”— docs.argilla.io

  42. thilinarajapakse/simpletransformers โญ 4,143
    Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
    ๐Ÿ”— simpletransformers.ai

  43. maartengr/KeyBERT โญ 3,682
    Minimal keyword extraction with BERT
    ๐Ÿ”— maartengr.github.io/keybert

  44. life4/textdistance โญ 3,436
    📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

  45. promptslab/Promptify โญ 3,387
    Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engineering, LLMs and other latest research
    ๐Ÿ”— discord.gg/m88xfymbk6

  46. jsvine/markovify โญ 3,320
    A simple, extensible Markov chain generator.

  47. bytedance/lightseq โญ 3,244
    LightSeq: A High Performance Library for Sequence Processing and Generation

  48. errbotio/errbot โญ 3,156
    Errbot is a chatbot, a daemon that connects to your favorite chat service and brings your tools and some fun into the conversation.
    ๐Ÿ”— errbot.io

  49. neuralmagic/deepsparse โญ 3,091
    Sparsity-aware deep learning inference runtime for CPUs
    ๐Ÿ”— neuralmagic.com/deepsparse

  50. huawei-noah/Pretrained-Language-Model โญ 3,054
    Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

  51. ddangelov/Top2Vec โญ 2,986
    Top2Vec learns jointly embedded topic, document and word vectors.

  52. jbesomi/texthero โญ 2,897
    Text preprocessing, representation and visualization from zero to hero.
    ๐Ÿ”— texthero.org

  53. salesforce/CodeT5 โญ 2,876
    Home of CodeT5: Open Code LLMs for Code Understanding and Generation
    ๐Ÿ”— arxiv.org/abs/2305.07922

  54. huggingface/neuralcoref โญ 2,865
    ✨ Fast Coreference Resolution in spaCy with Neural Networks
    ๐Ÿ”— huggingface.co/coref

  55. bigscience-workshop/promptsource โญ 2,757
    Toolkit for creating, sharing and using natural language prompts.

  56. bhavnicksm/chonkie โญ 2,388
    🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
    ๐Ÿ”— docs.chonkie.ai

  57. nvidia/nv-ingest โญ 2,383
    NVIDIA-Ingest is a scalable, performance-oriented document content and metadata extraction microservice.

  58. huggingface/setfit โญ 2,335
    SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers.
    ๐Ÿ”— hf.co/docs/setfit

  59. alibaba/EasyNLP โญ 2,094
    EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit

  60. jamesturk/jellyfish โญ 2,086
    🪼 a python library for doing approximate and phonetic matching of strings.
    ๐Ÿ”— jamesturk.github.io/jellyfish

  61. thudm/P-tuning-v2 โญ 2,002
    An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks

  62. marella/ctransformers โญ 1,835
    Python bindings for the Transformer models implemented in C/C++ using GGML library.

  63. featureform/featureform โญ 1,832
    The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
    ๐Ÿ”— www.featureform.com

  64. deepset-ai/FARM โญ 1,750
    🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
    ๐Ÿ”— farm.deepset.ai

  65. urchade/GLiNER โญ 1,717
    Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
    ๐Ÿ”— arxiv.org/abs/2311.08526

  66. franck-dernoncourt/NeuroNER โญ 1,703
    Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.
    ๐Ÿ”— neuroner.com

  67. explosion/spacy-models โญ 1,689
    💫 Models for the spaCy Natural Language Processing (NLP) library
    ๐Ÿ”— spacy.io

  68. google-research/language โญ 1,642
    Shared repository for open-sourced projects from the Google AI Language team.
    ๐Ÿ”— ai.google/research/teams/language

  69. plasticityai/magnitude โญ 1,638
    A fast, efficient universal vector embedding utility package.

  70. arxiv-vanity/arxiv-vanity โญ 1,613
    Renders papers from arXiv as responsive web pages so you don't have to squint at a PDF.
    ๐Ÿ”— www.arxiv-vanity.com

  71. chrismattmann/tika-python โญ 1,539
    Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

  72. nomic-ai/nomic โญ 1,460
    Interact, analyze and structure massive text, image, embedding, audio and video datasets
    ๐Ÿ”— atlas.nomic.ai

  73. intellabs/fastRAG โญ 1,438
    Efficient Retrieval Augmentation and Generation Framework

  74. dmmiller612/bert-extractive-summarizer โญ 1,417
    Easy to use extractive text summarization with BERT

  75. gunthercox/chatterbot-corpus โญ 1,378
    A multilingual dialog corpus
    ๐Ÿ”— chatterbot-corpus.readthedocs.io

  76. jonasgeiping/cramming โญ 1,311
    Cramming the training of a (BERT-type) language model into limited compute.

  77. pemistahl/lingua-py โญ 1,225
    The most accurate natural language detection library for Python, suitable for short text and mixed-language text

  78. openai/grade-school-math โญ 1,169
    GSM8K, a dataset of 8.5K high quality linguistically diverse grade school math word problems

  79. answerdotai/ModernBERT โญ 1,139
    Bringing BERT into modernity via both architecture changes and scaling
    ๐Ÿ”— arxiv.org/abs/2412.13663

  80. abertsch72/unlimiformer โญ 1,059
    Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"

  81. unitaryai/detoxify โญ 997
    Toxic Comment Classification with Pytorch Lightning and Transformers
    ๐Ÿ”— www.unitary.ai

  82. norskregnesentral/skweak โญ 923
    skweak: A software toolkit for weak supervision applied to NLP tasks

  83. keras-team/keras-hub โญ 834
    Pretrained model hub for Keras 3.
    ๐Ÿ”— keras.io/keras_hub

  84. explosion/spacy-streamlit โญ 825
    👑 spaCy building blocks and visualizers for Streamlit apps
    ๐Ÿ”— share.streamlit.io/ines/spacy-streamlit-demo/master/app.py

  85. paddlepaddle/RocketQA โญ 774
    🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.

  86. webis-de/small-text โญ 605
    Small-Text provides state-of-the-art Active Learning for Text Classification. Several pre-implemented Query Strategies, Initialization Strategies, and Stopping Criteria are provided, which can be easily mixed and matched to build active learning experiments or applications.
    ๐Ÿ”— small-text.readthedocs.io

  87. babelscape/rebel โญ 511
    REBEL is a seq2seq model that simplifies Relation Extraction (EMNLP 2021).

Packaging

Python packaging, dependency management and bundling.

  1. pyenv/pyenv โญ 40,558
    pyenv lets you easily switch between multiple versions of Python.

  2. astral-sh/uv โญ 38,223
    An extremely fast Python package installer and resolver, written in Rust. Designed as a drop-in replacement for pip and pip-compile.
    ๐Ÿ”— docs.astral.sh/uv

  3. python-poetry/poetry โญ 32,487
    Python packaging and dependency management made easy
    ๐Ÿ”— python-poetry.org

  4. pypa/pipenv โญ 24,970
    A virtualenv management tool that supports a multitude of systems and nicely bridges the gaps between pip, python and virtualenv.
    ๐Ÿ”— pipenv.pypa.io

  5. mitsuhiko/rye โญ 14,000
    a Hassle-Free Python Experience
    ๐Ÿ”— rye.astral.sh

  6. pyinstaller/pyinstaller โญ 12,108
    Freeze (package) Python programs into stand-alone executables
    ๐Ÿ”— www.pyinstaller.org

  7. pypa/pipx โญ 11,056
    Install and Run Python Applications in Isolated Environments
    ๐Ÿ”— pipx.pypa.io

  8. pdm-project/pdm โญ 8,104
    A modern Python package and dependency manager supporting the latest PEP standards
    ๐Ÿ”— pdm-project.org

  9. jazzband/pip-tools โญ 7,830
    A set of tools to keep your pinned Python dependencies fresh (pip-compile + pip-sync)
    ๐Ÿ”— pip-tools.rtfd.io

  10. mamba-org/mamba โญ 7,107
    The Fast Cross-Platform Package Manager: mamba is a reimplementation of the conda package manager in C++
    ๐Ÿ”— mamba.readthedocs.io

  11. conda-forge/miniforge โญ 6,985
    A conda-forge distribution.
    ๐Ÿ”— conda-forge.org/download

  12. conda/conda โญ 6,589
    A system-level, binary package and environment manager running on all major operating systems and platforms.
    ๐Ÿ”— docs.conda.io/projects/conda

  13. pypa/hatch โญ 6,317
    Modern, extensible Python project management
    ๐Ÿ”— hatch.pypa.io/latest

  14. indygreg/PyOxidizer โญ 5,620
    A modern Python application packaging and distribution tool

  15. pypa/virtualenv โญ 4,864
    A tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard lib venv module.
    ๐Ÿ”— virtualenv.pypa.io

  16. spack/spack โญ 4,529
    A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
    ๐Ÿ”— spack.io

  17. prefix-dev/pixi โญ 3,808
    pixi is a cross-platform, multi-language package manager and workflow tool built on the foundation of the conda ecosystem.
    ๐Ÿ”— pixi.sh

  18. pantsbuild/pex โญ 3,634
    A tool for generating .pex (Python EXecutable) files, lock files and venvs.
    ๐Ÿ”— docs.pex-tool.org

  19. beeware/briefcase โญ 2,754
    Tools to support converting a Python project into a standalone native application.
    ๐Ÿ”— briefcase.readthedocs.io

  20. pypa/flit โญ 2,190
    Simplified packaging of Python modules
    ๐Ÿ”— flit.pypa.io

  21. linkedin/shiv โญ 1,794
    shiv is a command line utility for building fully self contained Python zipapps as outlined in PEP 441, but with all their dependencies included.

  22. marcelotduarte/cx_Freeze โญ 1,399
    cx_Freeze creates standalone executables from Python scripts with the same performance. It is cross-platform and should work on any platform that Python itself works on.
    ๐Ÿ”— marcelotduarte.github.io/cx_freeze

  23. ofek/pyapp โญ 1,323
    Runtime installer for Python applications
    ๐Ÿ”— ofek.dev/pyapp

  24. pypa/gh-action-pypi-publish โญ 989
    The blessed :octocat: GitHub Action, for publishing your 📦 distribution files to PyPI, the tokenless way: https://github.com/marketplace/actions/pypi-publish
    ๐Ÿ”— packaging.python.org/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows

  25. py2exe/py2exe โญ 899
    Create standalone Windows programs from Python code
    ๐Ÿ”— www.py2exe.org

  26. prefix-dev/rip โญ 656
    RIP is a library that allows the resolving and installing of Python PyPI packages from Rust into a virtual environment. It's based on our experience with building Rattler and aims to provide the same experience but for PyPI instead of Conda.
    ๐Ÿ”— prefix.dev

  27. snok/install-poetry โญ 605
    GitHub Action for installing and configuring Poetry

  28. python-poetry/install.python-poetry.org โญ 218
    The official Poetry installation script
    ๐Ÿ”— install.python-poetry.org

Pandas

Pandas and dataframe libraries: data analysis, statistical reporting, pandas GUIs, pandas performance optimisations.

  1. pandas-dev/pandas โญ 44,478
    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
    ๐Ÿ”— pandas.pydata.org

  2. pola-rs/polars โญ 31,696
    Dataframes powered by a multithreaded, vectorized query engine, written in Rust
    ๐Ÿ”— docs.pola.rs

  3. duckdb/duckdb โญ 26,205
    DuckDB is an analytical in-process SQL database management system
    ๐Ÿ”— www.duckdb.org

  4. gventuri/pandas-ai โญ 14,106
    Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
    ๐Ÿ”— getpanda.ai

  5. kanaries/pygwalker โญ 14,010
    PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis
    ๐Ÿ”— kanaries.net/pygwalker

  6. ydataai/ydata-profiling โญ 12,686
    1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
    ๐Ÿ”— docs.profiling.ydata.ai

  7. rapidsai/cudf โญ 8,641
    cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data
    ๐Ÿ”— docs.rapids.ai/api/cudf/stable

  8. aws/aws-sdk-pandas โญ 3,974
    pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
    ๐Ÿ”— aws-sdk-pandas.readthedocs.io

  9. nalepae/pandarallel โญ 3,716
    A simple and efficient tool to parallelize Pandas operations on all available CPUs
    ๐Ÿ”— nalepae.github.io/pandarallel

  10. unionai-oss/pandera โญ 3,597
    A light-weight, flexible, and expressive statistical data testing library
    ๐Ÿ”— www.union.ai/pandera

  11. adamerose/PandasGUI โญ 3,213
    A GUI for Pandas DataFrames

  12. blaze/blaze โญ 3,191
    NumPy and Pandas interface to Big Data
    ๐Ÿ”— blaze.pydata.org

  13. pydata/pandas-datareader โญ 3,000
    Extract data from a wide range of Internet sources into a pandas DataFrame.
    ๐Ÿ”— pydata.github.io/pandas-datareader/stable/index.html

  14. scikit-learn-contrib/sklearn-pandas โญ 2,820
    Pandas integration with sklearn

  15. jmcarpenter2/swifter โญ 2,564
    A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner

  16. delta-io/delta-rs โญ 2,538
    A native Rust library for Delta Lake, with bindings into Python
    ๐Ÿ”— delta-io.github.io/delta-rs

  17. eventual-inc/Daft โญ 2,511
    Distributed data engine for Python/SQL designed for the cloud, powered by Rust
    ๐Ÿ”— getdaft.io

  18. fugue-project/fugue โญ 2,037
    A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
    ๐Ÿ”— fugue-tutorials.readthedocs.io

  19. pyjanitor-devs/pyjanitor โญ 1,390
    Clean APIs for data cleaning. Python implementation of R package Janitor
    ๐Ÿ”— pyjanitor-devs.github.io/pyjanitor

  20. machow/siuba โญ 1,167
    Python library for using dplyr like syntax with pandas and SQL
    ๐Ÿ”— siuba.org

  21. holoviz/hvplot โญ 1,162
    A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
    ๐Ÿ”— hvplot.holoviz.org

  22. renumics/spotlight โญ 1,145
    Interactively explore unstructured datasets from your dataframe.
    ๐Ÿ”— renumics.com

  23. tkrabel/bamboolib โญ 942
    bamboolib - a GUI for pandas DataFrames
    ๐Ÿ”— bamboolib.com

  24. mwouts/itables โญ 824
    This package changes how Pandas and Polars DataFrames are rendered in Jupyter Notebooks. With itables you can display your tables as interactive DataTables that you can sort, paginate, scroll or filter.
    ๐Ÿ”— mwouts.github.io/itables

Performance

Performance, parallelisation and low level libraries.

  1. celery/celery โญ 25,442
    Distributed Task Queue (development branch)
    ๐Ÿ”— docs.celeryq.dev

  2. google/flatbuffers โญ 23,718
    FlatBuffers: Memory Efficient Serialization Library
    ๐Ÿ”— flatbuffers.dev

  3. pybind/pybind11 โญ 16,118
    Seamless operability between C++11 and Python
    ๐Ÿ”— pybind11.readthedocs.io

  4. exaloop/codon โญ 15,348
    A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support
    ๐Ÿ”— docs.exaloop.io/codon

  5. dask/dask โญ 12,905
    Parallel computing with task scheduling
    ๐Ÿ”— dask.org

  6. numba/numba โญ 10,171
    NumPy aware dynamic Python compiler using LLVM
    ๐Ÿ”— numba.pydata.org

  7. modin-project/modin โญ 10,002
    Modin: Scale your Pandas workflows by changing a single line of code
    ๐Ÿ”— modin.readthedocs.io

  8. nebuly-ai/optimate โญ 8,372
    A collection of libraries to optimise AI model performance
    ๐Ÿ”— www.nebuly.com

  9. vaexio/vaex โญ 8,332
    Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
    ๐Ÿ”— vaex.io

  10. mher/flower โญ 6,580
    Real-time monitor and web admin for Celery distributed task queue
    ๐Ÿ”— flower.readthedocs.io

  11. python-trio/trio โญ 6,337
    Trio – a friendly Python library for async concurrency and I/O
    ๐Ÿ”— trio.readthedocs.io

  12. ultrajson/ultrajson โญ 4,364
    Ultra fast JSON decoder and encoder written in C with Python bindings
    ๐Ÿ”— pypi.org/project/ujson

  13. tlkh/asitop โญ 3,787
    Perf monitoring CLI tool for Apple Silicon
    ๐Ÿ”— tlkh.github.io/asitop

  14. facebookincubator/cinder โญ 3,561
    Cinder is Meta's internal performance-oriented production version of CPython.
    ๐Ÿ”— trycinder.com

  15. airtai/faststream โญ 3,419
    FastStream is a powerful and easy-to-use Python framework for building asynchronous services interacting with event streams such as Apache Kafka, RabbitMQ, NATS and Redis.
    ๐Ÿ”— faststream.airt.ai/latest

  16. ipython/ipyparallel โญ 2,603
    IPython Parallel: Interactive Parallel Computing in Python
    ๐Ÿ”— ipyparallel.readthedocs.io

  17. intel/intel-extension-for-transformers โญ 2,157
    ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms ⚡

  18. h5py/h5py โญ 2,108
    HDF5 for Python -- The h5py package is a Pythonic interface to the HDF5 binary data format.
    ๐Ÿ”— www.h5py.org

  19. agronholm/anyio โญ 1,905
    High level asynchronous concurrency and networking framework that works on top of either trio or asyncio

  20. tiangolo/asyncer โญ 1,787
    Asyncer, async and await, focused on developer experience.
    ๐Ÿ”— asyncer.tiangolo.com

  21. intel/intel-extension-for-pytorch โญ 1,721
    A Python package that extends the official PyTorch to easily obtain better performance on Intel platforms

  22. faster-cpython/ideas โญ 1,705
    Discussion and work tracker for Faster CPython project.

  23. dask/distributed โญ 1,595
    A distributed task scheduler for Dask
    ๐Ÿ”— distributed.dask.org

  24. nschloe/perfplot โญ 1,357
    📈 Performance analysis for Python snippets

  25. intel/scikit-learn-intelex โญ 1,247
    Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
    ๐Ÿ”— uxlfoundation.github.io/scikit-learn-intelex

  26. markshannon/faster-cpython โญ 945
    How to make CPython faster.

  27. zerointensity/pointers.py โญ 920
    Bringing the hell of pointers to Python.
    ๐Ÿ”— pointers.zintensity.dev

  28. brandtbucher/specialist โญ 647
    Visualize CPython's specializing, adaptive interpreter. 🔥

Profiling

Memory and CPU/GPU profiling tools and libraries.

  1. bloomberg/memray โญ 13,638
    Memray is a memory profiler for Python
    ๐Ÿ”— bloomberg.github.io/memray

  2. benfred/py-spy โญ 13,201
    Sampling profiler for Python programs

  3. plasma-umass/scalene โญ 12,399
    Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

  4. joerick/pyinstrument โญ 6,833
    🚴 Call stack profiler for Python. Shows you why your code is slow!
    ๐Ÿ”— pyinstrument.readthedocs.io

  5. gaogaotiantian/viztracer โญ 5,926
    A debugging and profiling tool that can trace and visualize python code execution
    ๐Ÿ”— viztracer.readthedocs.io

  6. pythonprofilers/memory_profiler โญ 4,433
    Monitor Memory usage of Python code
    ๐Ÿ”— pypi.python.org/pypi/memory_profiler

  7. reloadware/reloadium โญ 2,845
    Hot Reloading and Profiling for Python

  8. pyutils/line_profiler โญ 2,830
    Line-by-line profiling for Python

  9. jiffyclub/snakeviz โญ 2,385
    An in-browser Python profile viewer
    ๐Ÿ”— jiffyclub.github.io/snakeviz

  10. p403n1x87/austin โญ 1,980
    Python frame stack sampler for CPython
    ๐Ÿ”— pypi.org/project/austin-dist

  11. pythonspeed/filprofiler โญ 857
    A Python memory profiler for data processing and scientific computing applications
    ๐Ÿ”— pythonspeed.com/products/filmemoryprofiler

Security

Security related libraries: vulnerability discovery, SQL injection, environment auditing.

  1. swisskyrepo/PayloadsAllTheThings โญ 62,972
    A list of useful payloads and bypasses for Web Application Security and Pentest/CTF
    ๐Ÿ”— swisskyrepo.github.io/payloadsallthethings

  2. sqlmapproject/sqlmap โญ 33,211
    Automatic SQL injection and database takeover tool
    ๐Ÿ”— sqlmap.org

  3. certbot/certbot โญ 31,871
    Certbot is EFF's tool to obtain certs from Let's Encrypt and (optionally) auto-enable HTTPS on your server. It can also act as a client for any other CA that uses the ACME protocol.

  4. aquasecurity/trivy โญ 24,542
    Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
    ๐Ÿ”— trivy.dev

  5. bridgecrewio/checkov โญ 7,339
    Checkov is a static code analysis tool for infrastructure as code (IaC) and also a software composition analysis (SCA) tool for images and open source packages.
    ๐Ÿ”— www.checkov.io

  6. nccgroup/ScoutSuite โญ 6,910
    Multi-Cloud Security Auditing Tool

  7. stamparm/maltrail โญ 6,723
    Malicious traffic detection system

  8. pycqa/bandit โญ 6,703
    Bandit is a tool designed to find common security issues in Python code.
    ๐Ÿ”— bandit.readthedocs.io

  9. rhinosecuritylabs/pacu โญ 4,510
    The AWS exploitation framework, designed for testing the security of Amazon Web Services environments.
    ๐Ÿ”— rhinosecuritylabs.com/aws/pacu-open-source-aws-exploitation-framework

  10. dashingsoft/pyarmor โญ 3,904
    A tool used to obfuscate Python scripts, bind obfuscated scripts to a fixed machine, or expire obfuscated scripts.
    ๐Ÿ”— pyarmor.dashingsoft.com

  11. pyupio/safety โญ 1,789
    Safety checks Python dependencies for known security vulnerabilities and suggests the proper remediations for vulnerabilities detected.
    ๐Ÿ”— safetycli.com/product/safety-cli

  12. trailofbits/pip-audit โญ 1,005
    Audits Python environments, requirements files and dependency trees for known security vulnerabilities, and can automatically fix them
    ๐Ÿ”— pypi.org/project/pip-audit

  13. fadi002/de4py โญ 877
    Toolkit for Python reverse engineering
    ๐Ÿ”— de4py.000.pe

  14. thecyb3ralpha/BobTheSmuggler โญ 506
    A tool that leverages HTML Smuggling Attack and allows you to create HTML files with embedded 7z/zip archives.

Simulation

Simulation libraries: robotics, economic, agent-based, traffic, physics, astronomy, chemistry, quantum simulation. Also see the Maths and Science category for crossover.

  1. atsushisakai/PythonRobotics โญ 24,151
    Python sample code and textbook for robotics algorithms.
    ๐Ÿ”— atsushisakai.github.io/pythonrobotics

  2. genesis-embodied-ai/Genesis โญ 23,535
    Genesis is a physics platform, and generative data engine, designed for general purpose Robotics/Embodied AI/Physical AI applications

  3. bulletphysics/bullet3 โญ 12,950
    Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
    ๐Ÿ”— bulletphysics.org

  4. isl-org/Open3D โญ 11,861
    Open3D: A Modern Library for 3D Data Processing
    ๐Ÿ”— www.open3d.org

  5. dlr-rm/stable-baselines3 โญ 9,673
    Stable Baselines3 (SB3) is a set of reliable implementations of reinforcement learning algorithms in PyTorch
    ๐Ÿ”— stable-baselines3.readthedocs.io

  6. nvidia/Cosmos โญ 7,364
    NVIDIA Cosmos is a developer-first world foundation model platform designed to help Physical AI developers build their Physical AI systems better and faster.

  7. qiskit/qiskit โญ 5,620
    Qiskit is an open-source SDK for working with quantum computers at the level of extended quantum circuits, operators, and primitives.
    ๐Ÿ”— www.ibm.com/quantum/qiskit

  8. astropy/astropy โญ 4,568
    Astronomy and astrophysics core library
    ๐Ÿ”— www.astropy.org

  9. nvidia/warp โญ 4,504
    A Python framework for high performance GPU simulation and graphics
    ๐Ÿ”— nvidia.github.io/warp

  10. quantumlib/Cirq โญ 4,429
    A Python framework for creating, editing, and invoking Noisy Intermediate-Scale Quantum (NISQ) circuits.
    ๐Ÿ”— quantumai.google/cirq

  11. chakazul/Lenia โญ 3,573
    Lenia is a 2D cellular automaton with continuous space, time and states. It produces a huge variety of interesting mathematical life forms
    ๐Ÿ”— chakazul.github.io/lenia/javascript/lenia.html

  12. openai/mujoco-py โญ 2,918
    MuJoCo is a physics engine for detailed, efficient rigid body simulations with contacts. mujoco-py allows using MuJoCo from Python 3.

  13. rdkit/rdkit โญ 2,772
    The official sources for the RDKit library

  14. projectmesa/mesa โญ 2,762
    Mesa is an open-source Python library for agent-based modeling, ideal for simulating complex systems and exploring emergent behaviors.
    ๐Ÿ”— mesa.readthedocs.io

  15. nvidia-omniverse/IsaacLab โญ 2,712
    Unified framework for robot learning built on NVIDIA Isaac Sim
    ๐Ÿ”— isaac-sim.github.io/isaaclab

  16. taichi-dev/difftaichi โญ 2,538
    10 differentiable physical simulators built with Taichi differentiable programming (DiffTaichi, ICLR 2020)

  17. google/brax โญ 2,507
    Massively parallel rigid body physics simulation on accelerator hardware.

  18. dlr-rm/rl-baselines3-zoo โญ 2,226
    A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
    ๐Ÿ”— rl-baselines3-zoo.readthedocs.io

  19. nvidia-omniverse/IsaacGymEnvs โญ 2,191
    Example RL environments for the NVIDIA Isaac Gym high performance environments

  20. facebookresearch/habitat-lab โญ 2,099
    A modular high-level library to train embodied AI agents across a variety of tasks and environments.
    ๐Ÿ”— aihabitat.org

  21. quantecon/QuantEcon.py โญ 2,037
    A community based Python library for quantitative economics
    ๐Ÿ”— quantecon.org/quantecon-py

  22. microsoft/PromptCraft-Robotics โญ 1,947
    Community for applying LLMs to robotics and a robot simulator with ChatGPT integration
    ๐Ÿ”— aka.ms/chatgpt-robotics

  23. eloialonso/diamond โญ 1,715
    DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model
    ๐Ÿ”— diamond-wm.github.io

  24. deepmodeling/deepmd-kit โญ 1,568
    A deep learning package for many-body potential energy representation and molecular dynamics
    ๐Ÿ”— docs.deepmodeling.com/projects/deepmd

  25. sail-sg/envpool โญ 1,120
    C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
    ๐Ÿ”— envpool.readthedocs.io

  26. bowang-lab/scGPT โญ 1,109
    scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI
    ๐Ÿ”— scgpt.readthedocs.io/en/latest

  27. a-r-j/graphein โญ 1,058
    Protein Graph Library
    ๐Ÿ”— graphein.ai

  28. viblo/pymunk โญ 955
    Pymunk is an easy-to-use pythonic 2D physics library that can be used whenever you need 2D rigid body physics from Python
    ๐Ÿ”— www.pymunk.org

  29. google-deepmind/materials_discovery โญ 931
    Graph Networks for Materials Science (GNoME) is a project centered around scaling machine learning methods to tackle materials science.

  30. nvidia-omniverse/OmniIsaacGymEnvs โญ 919
    Reinforcement Learning Environments for Omniverse Isaac Gym

  31. altera-al/project-sid โญ 908
    This repository contains our technical report: "Project Sid: Many-agent simulations toward AI civilization"

  32. google/evojax โญ 867
    EvoJAX is a scalable, general purpose, hardware-accelerated neuroevolution toolkit built on the JAX library

  33. facebookresearch/fairo โญ 860
    A modular embodied agent architecture and platform for building embodied agents

  34. eureka-research/DrEureka โญ 838
    Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024)
    ๐Ÿ”— eureka-research.github.io/dr-eureka

  35. polymathicai/the_well โญ 738
    15TB of Physics Simulations: collection of machine learning datasets containing numerical simulations of a wide variety of spatiotemporal physical systems.
    ๐Ÿ”— polymathic-ai.org/the_well

  36. ur-whitelab/chemcrow-public โญ 680
    Chemcrow

  37. ur-whitelab/chemcrow-runs โญ 77
    ur-whitelab/chemcrow-runs

Study

Miscellaneous study resources: algorithms, general resources, system design, code repos for textbooks, best practices, tutorials.

  1. thealgorithms/Python โญ 196,978
    All Algorithms implemented in Python
    ๐Ÿ”— thealgorithms.github.io/python

  2. microsoft/generative-ai-for-beginners โญ 70,017
    21 Lessons, Get Started Building with Generative AI
    ๐Ÿ”— microsoft.github.io/generative-ai-for-beginners

  3. mlabonne/llm-course โญ 45,322
    Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
    ๐Ÿ”— mlabonne.github.io/blog

  4. jakevdp/PythonDataScienceHandbook โญ 43,768
    Python Data Science Handbook: full text in Jupyter Notebooks
    ๐Ÿ”— jakevdp.github.io/pythondatasciencehandbook

  5. rasbt/LLMs-from-scratch โญ 39,139
    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
    ๐Ÿ”— amzn.to/4fqvn0d

  6. realpython/python-guide โญ 28,607
    Python best practices guidebook, written for humans.
    ๐Ÿ”— docs.python-guide.org

  7. d2l-ai/d2l-en โญ 24,823
    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
    ๐Ÿ”— d2l.ai

  8. christoschristofidis/awesome-deep-learning โญ 24,733
    A curated list of awesome Deep Learning tutorials, projects and communities.

  9. wesm/pydata-book โญ 22,643
    Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media

  10. hannibal046/Awesome-LLM โญ 21,185
    Awesome-LLM: a curated list of Large Language Models

  11. microsoft/recommenders โญ 19,700
    Best Practices on Recommendation Systems
    ๐Ÿ”— recommenders-team.github.io/recommenders/intro.html

  12. fchollet/deep-learning-with-python-notebooks โญ 18,946
    Jupyter notebooks for the code samples of the book "Deep Learning with Python"

  13. graykode/nlp-tutorial โญ 14,401
    Natural Language Processing Tutorial for Deep Learning Researchers
    ๐Ÿ”— www.reddit.com/r/machinelearning/comments/amfinl/project_nlptutoral_repository_who_is_studying

  14. naklecha/llama3-from-scratch โญ 14,085
    llama3 implementation one matrix multiplication at a time

  15. shangtongzhang/reinforcement-learning-an-introduction โญ 13,790
    Python Implementation of Reinforcement Learning: An Introduction

  16. karpathy/nn-zero-to-hero โญ 13,092
    Neural Networks: Zero to Hero

  17. mrdbourke/pytorch-deep-learning โญ 12,210
    Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.
    ๐Ÿ”— learnpytorch.io

  18. eugeneyan/open-llms โญ 11,606
    📋 A list of open LLMs available for commercial use.

  19. karpathy/micrograd โญ 11,061
    A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

  20. rucaibox/LLMSurvey โญ 10,910
    The official GitHub page for the survey paper "A Survey of Large Language Models".
    ๐Ÿ”— arxiv.org/abs/2303.18223

  21. openai/spinningup โญ 10,458
    An educational resource to help anyone learn deep reinforcement learning.
    ๐Ÿ”— spinningup.openai.com

  22. srush/GPU-Puzzles โญ 10,441
    Teaching beginner GPU programming in a completely interactive fashion

  23. zhanymkanov/fastapi-best-practices โญ 10,231
    FastAPI Best Practices and Conventions we used at our startup

  24. nielsrogge/Transformers-Tutorials โญ 9,880
    This repository contains demos I made with the Transformers library by HuggingFace.

  25. mooler0410/LLMsPracticalGuide โญ 9,674
    A curated list of practical guide resources for LLMs (LLMs Tree, Examples, Papers)
    ๐Ÿ”— arxiv.org/abs/2304.13712v2

  26. firmai/industry-machine-learning โญ 7,293
    A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)
    ๐Ÿ”— www.sov.ai

  27. udlbook/udlbook โญ 6,961
    Understanding Deep Learning - Simon J.D. Prince

  28. gkamradt/langchain-tutorials โญ 6,895
    Overview and tutorial of the LangChain Library

  29. roboflow/notebooks โญ 6,730
    This repository offers a comprehensive collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM 2, Florence-2, PaliGemma 2, and Qwen2.5VL.
    ๐Ÿ”— roboflow.com/models

  30. neetcode-gh/leetcode โญ 5,879
    Leetcode solutions for NeetCode.io

  31. mrdbourke/tensorflow-deep-learning โญ 5,421
    All course materials for the Zero to Mastery Deep Learning with TensorFlow course.
    ๐Ÿ”— dbourke.link/ztmtfcourse

  32. alirezadir/Machine-Learning-Interviews โญ 5,389
    This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

  33. udacity/deep-learning-v2-pytorch โญ 5,357
    Projects and exercises for the latest Deep Learning ND program https://www.udacity.com/course/deep-learning-nanodegree--nd101

  34. huggingface/smol-course โญ 5,233
    A practical course on aligning language models for your specific use case. It's a handy way to get started with aligning language models, because everything runs on most local machines.

  35. timofurrer/awesome-asyncio โญ 4,701
    A curated list of awesome Python asyncio frameworks, libraries, software and resources

  36. zotroneneis/machine_learning_basics โญ 4,343
    Plain python implementations of basic machine learning algorithms

  37. handsonllm/Hands-On-Large-Language-Models โญ 4,311
    Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
    ๐Ÿ”— www.llm-book.com

  38. promptslab/Awesome-Prompt-Engineering โญ 4,145
    This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM etc.
    ๐Ÿ”— discord.gg/m88xfymbk6

  39. huggingface/deep-rl-class โญ 4,027
    This repo contains the Hugging Face Deep Reinforcement Learning Course.

  40. rasbt/machine-learning-book โญ 3,906
    Code Repository for Machine Learning with PyTorch and Scikit-Learn
    ๐Ÿ”— sebastianraschka.com/books/#machine-learning-with-pytorch-and-scikit-learn

  41. huggingface/diffusion-models-class โญ 3,855
    Materials for the Hugging Face Diffusion Models Course

  42. cosmicpython/book โญ 3,472
    A Book about Pythonic Application Architecture Patterns for Managing Complexity. Cosmos is the Opposite of Chaos you see. O'R. wouldn't actually let us call it "Cosmic Python" tho.
    ๐Ÿ”— www.cosmicpython.com

  43. amanchadha/coursera-deep-learning-specialization โญ 3,445
    Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks and Deep Learning; (ii) Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization; (iii) Structuring Machine Learning Projects; (iv...

  44. fluentpython/example-code-2e โญ 3,411
    Example code for Fluent Python, 2nd edition (O'Reilly 2022)
    ๐Ÿ”— amzn.to/3j48u2j

  45. mrdbourke/zero-to-mastery-ml โญ 3,074
    All course materials for the Zero to Mastery Machine Learning and Data Science course.
    ๐Ÿ”— dbourke.link/ztmmlcourse

  46. krzjoa/awesome-python-data-science โญ 2,693
    Probably the best curated list of data science software in Python.
    ๐Ÿ”— krzjoa.github.io/awesome-python-data-science

  47. chiphuyen/aie-book โญ 2,269
    Code for AI Engineering: Building Applications with Foundation Models (Chip Huyen 2025)

  48. gerdm/prml โญ 2,236
    Repository of notes, code and notebooks in Python for the book Pattern Recognition and Machine Learning by Christopher Bishop

  49. cgpotts/cs224u โญ 2,133
    Code for CS224u: Natural Language Understanding

  50. cerlymarco/MEDIUM_NoteBook โญ 2,096
    Repository containing notebooks of my posts on Medium

  51. trananhkma/fucking-awesome-python โญ 1,988
    awesome-python with :octocat: ⭐ and 🍴

  52. huggingface/cookbook โญ 1,816
    Community-driven practical examples of building AI applications and solving various tasks with AI using open-source tools and models.
    ๐Ÿ”— huggingface.co/learn/cookbook

  53. chandlerbang/awesome-self-supervised-gnn โญ 1,636
    Papers about pretraining and self-supervised learning on Graph Neural Networks (GNN).

  54. atcold/NYU-DLSP21 โญ 1,593
    NYU Deep Learning Spring 2021
    ๐Ÿ”— atcold.github.io/nyu-dlsp21

  55. patrickloeber/MLfromscratch โญ 1,383
    Machine Learning algorithm implementations from scratch.

  56. davidadsp/Generative_Deep_Learning_2nd_Edition โญ 1,198
    The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
    ๐Ÿ”— www.oreilly.com/library/view/generative-deep-learning/9781098134174

  57. rasbt/LLM-workshop-2024 โญ 859
    A 4-hour coding workshop to understand how LLMs are implemented and used

  58. jackhidary/quantumcomputingbook โญ 822
    Companion site for the textbook Quantum Computing: An Applied Approach

  59. bayesianmodelingandcomputationinpython/BookCode_Edition1 โญ 510
    Bayesian Modeling and Computation in Python: open-access version of the text and the code examples in the book
    ๐Ÿ”— www.bayesiancomputationbook.com

  60. dylanhogg/awesome-python โญ 348
    🐍 Hand-picked awesome Python libraries and frameworks, organised by category
    ๐Ÿ”— www.awesomepython.org

Template

Template tools and libraries: cookiecutter repos, generators, quick-starts.

  1. tiangolo/full-stack-fastapi-template โญ 29,474
    Full stack, modern web application template. Using FastAPI, React, SQLModel, PostgreSQL, Docker, GitHub Actions, automatic HTTPS and more.

  2. cookiecutter/cookiecutter โญ 22,999
    A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C projects.
    ๐Ÿ”— pypi.org/project/cookiecutter

  3. drivendata/cookiecutter-data-science โญ 8,538
    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
    ๐Ÿ”— cookiecutter-data-science.drivendata.org

  4. buuntu/fastapi-react โญ 2,296
    🚀 Cookiecutter Template for FastAPI + React Projects. Using PostgreSQL, SQLAlchemy, and Docker

  5. pyscaffold/pyscaffold โญ 2,151
    🛠 Python project template generator with batteries included
    ๐Ÿ”— pyscaffold.org

  6. cjolowicz/cookiecutter-hypermodern-python โญ 1,838
    Cookiecutter template for a Python package based on the Hypermodern Python article series.
    ๐Ÿ”— cookiecutter-hypermodern-python.readthedocs.io

  7. fmind/mlops-python-package โญ 1,103
    Best practices designed to support your MLOps initiatives. You can use this package as part of your MLOps toolkit or platform e.g. Model Registry, Experiment Tracking, Realtime Inference
    ๐Ÿ”— fmind.github.io/mlops-python-package

  8. tezromach/python-package-template โญ 1,085
    🚀 Your next Python package needs a bleeding-edge project structure.

  9. martinheinz/python-project-blueprint โญ 963
    Blueprint/Boilerplate For Python Projects

  10. callmesora/llmops-python-package โญ 810
    Best practices designed to support your LLMOps initiatives. You can use this package as part of your LLMOps toolkit or platform e.g. Model Registry, Experiment Tracking, Realtime Inference

Terminal

Terminal and console tools and libraries: CLI tools, terminal based formatters, progress bars.

  1. willmcgugan/rich โญ 50,556
    Rich is a Python library for rich text and beautiful formatting in the terminal.
    ๐Ÿ”— rich.readthedocs.io/en/latest

  2. tqdm/tqdm โญ 29,182
    ⚡ A Fast, Extensible Progress Bar for Python and CLI
    ๐Ÿ”— tqdm.github.io

  3. google/python-fire โญ 27,374
    Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.

  4. willmcgugan/textual โญ 27,183
    The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
    ๐Ÿ”— textual.textualize.io

  5. tiangolo/typer โญ 16,328
    Typer, build great CLIs. Easy to code. Based on Python type hints.
    ๐Ÿ”— typer.tiangolo.com

  6. pallets/click โญ 16,022
    Python composable command line interface toolkit
    ๐Ÿ”— click.palletsprojects.com

  7. saulpw/visidata โญ 8,029
    A terminal spreadsheet multitool for discovering and arranging data
    ๐Ÿ”— visidata.org

  8. tconbeer/harlequin โญ 4,119
    The SQL IDE for Your Terminal.
    ๐Ÿ”— harlequin.sh

  9. manrajgrover/halo โญ 2,913
    💫 Beautiful spinners for terminal, IPython and Jupyter

  10. urwid/urwid โญ 2,850
    Console user interface library for Python (official repo)
    ๐Ÿ”— urwid.org

  11. textualize/trogon โญ 2,556
    Easily turn your Click CLI into a powerful terminal application

  12. darrenburns/elia โญ 1,985
    A snappy, keyboard-centric terminal user interface for interacting with large language models. Chat with ChatGPT, Claude, Llama 3, Phi 3, Mistral, Gemma and more.

  13. tmbo/questionary โญ 1,639
    Python library to build pretty command line user prompts ✨ Easy to use multi-select lists, confirmations, free text prompts ...

  14. jazzband/prettytable โญ 1,433
    Display tabular data in a visually appealing ASCII table format
    ๐Ÿ”— pypi.org/project/prettytable

  15. 1j01/textual-paint โญ 983
    🎨 MS Paint in your terminal.
    ๐Ÿ”— pypi.org/project/textual-paint

Testing

Testing libraries: unit testing, load testing, acceptance testing, code coverage, browser automation, plugins.

  1. mitmproxy/mitmproxy โญ 37,773
    An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
    ๐Ÿ”— mitmproxy.org

  2. locustio/locust โญ 25,520
    Write scalable load tests in plain Python 🚗💨
    ๐Ÿ”— locust.cloud

  3. pytest-dev/pytest โญ 12,380
    The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
    ๐Ÿ”— pytest.org

  4. microsoft/playwright-python โญ 12,288
    Python version of the Playwright testing and automation library.
    ๐Ÿ”— playwright.dev/python

  5. robotframework/robotframework โญ 10,219
    Generic automation framework for acceptance testing and RPA
    ๐Ÿ”— robotframework.org

  6. seleniumbase/SeleniumBase โญ 9,253
    Python APIs for web automation, testing, and bypassing bot-detection.
    ๐Ÿ”— seleniumbase.io

  7. getmoto/moto โญ 7,757
    A library that allows you to easily mock out tests based on AWS infrastructure.
    ๐Ÿ”— docs.getmoto.org/en/latest

  8. hypothesisworks/hypothesis โญ 7,689
    Hypothesis is a powerful, flexible, and easy to use library for property-based testing.
    ๐Ÿ”— hypothesis.works

  9. newsapps/beeswithmachineguns โญ 6,461
    A utility for arming (creating) many bees (micro EC2 instances) to attack (load test) targets (web applications).
    ๐Ÿ”— apps.chicagotribune.com

  10. codium-ai/qodo-cover โญ 4,770
    Qodo-Cover: An AI-Powered Tool for Automated Test Generation and Code Coverage Enhancement! 💻🤖🧪🐞
    ๐Ÿ”— qodo.ai

  11. confident-ai/deepeval โญ 4,657
    The LLM Evaluation Framework
    ๐Ÿ”— docs.confident-ai.com

  12. spulec/freezegun โญ 4,264
    Let your Python tests travel through time

  13. getsentry/responses โญ 4,204
    A utility for mocking out the Python Requests library.

  14. tox-dev/tox โญ 3,735
    Command line driven CI frontend and development task automation tool.
    ๐Ÿ”— tox.wiki

  15. behave/behave โญ 3,244
    BDD, Python style.
    ๐Ÿ”— behave.readthedocs.io/en/latest

  16. nedbat/coveragepy โญ 3,076
    The code coverage tool for Python
    ๐Ÿ”— coverage.readthedocs.io

  17. kevin1024/vcrpy โญ 2,750
    Automatically mock your HTTP interactions to simplify and speed up testing

  18. cobrateam/splinter โญ 2,732
    splinter - python test framework for web applications
    ๐Ÿ”— splinter.readthedocs.org/en/stable/index.html

  19. pytest-dev/pytest-testinfra โญ 2,388
    With Testinfra you can write unit tests in Python to test actual state of your servers configured by management tools like Salt, Ansible, Puppet, Chef and so on.
    ๐Ÿ”— testinfra.readthedocs.io

  20. pytest-dev/pytest-mock โญ 1,887
    Thin wrapper around the mock package for easier use with pytest
    ๐Ÿ”— pytest-mock.readthedocs.io/en/latest

  21. pytest-dev/pytest-cov โญ 1,809
    Coverage plugin for pytest.

  22. pytest-dev/pytest-xdist โญ 1,529
    pytest plugin for distributed testing and loop-on-failures testing modes.
    ๐Ÿ”— pytest-xdist.readthedocs.io

  23. pytest-dev/pytest-asyncio โญ 1,462
    Asyncio support for pytest
    ๐Ÿ”— pytest-asyncio.readthedocs.io

  24. taverntesting/tavern โญ 1,046
    A command-line tool, Python library and Pytest plugin for automated testing of RESTful APIs, with a simple, concise and flexible YAML-based syntax
    ๐Ÿ”— taverntesting.github.io

Machine Learning - Time Series

Machine learning and classical timeseries libraries: forecasting, seasonality, anomaly detection, econometrics.

  1. facebook/prophet โญ 18,814
    Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
    ๐Ÿ”— facebook.github.io/prophet

  2. blue-yonder/tsfresh โญ 8,555
    Automatic extraction of relevant features from time series
    ๐Ÿ”— tsfresh.readthedocs.io

  3. unit8co/darts โญ 8,306
    A Python library for user-friendly forecasting and anomaly detection on time series.
    ๐Ÿ”— unit8co.github.io/darts

  4. sktime/sktime โญ 8,169
    A unified framework for machine learning with time series
    ๐Ÿ”— www.sktime.net

  5. facebookresearch/Kats โญ 5,619
    Kats, a kit to analyze time series data, a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristics, detecting change points and anomalies, to forecasting future trends.

  6. awslabs/gluonts โญ 4,752
    Probabilistic time series modeling in Python
    ๐Ÿ”— ts.gluon.ai

  7. google-research/timesfm โญ 4,303
    TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
    ๐Ÿ”— research.google/blog/a-decoder-only-foundation-model-for-time-series-forecasting

  8. nixtla/statsforecast โญ 4,128
    Lightning โšก๏ธ fast forecasting with statistical and econometric models.
    ๐Ÿ”— nixtlaverse.nixtla.io/statsforecast

  9. tdameritrade/stumpy โญ 3,770
    STUMPY is a powerful and scalable Python library for modern time series analysis
    ๐Ÿ”— stumpy.readthedocs.io/en/latest

  10. salesforce/Merlion ⭐ 3,517
    Merlion: A Machine Learning Framework for Time Series Intelligence

  11. amazon-science/chronos-forecasting ⭐ 2,879
    Chronos: Pretrained Models for Probabilistic Time Series Forecasting
    🔗 arxiv.org/abs/2403.07815

  12. rjt1990/pyflux ⭐ 2,115
    Open source time series library for Python

  13. aistream-peelout/flow-forecast ⭐ 2,106
    Deep learning PyTorch library for time series forecasting, classification, and anomaly detection (originally for flood forecasting).
    🔗 flow-forecast.atlassian.net/wiki/spaces/ff/overview

  14. uber/orbit ⭐ 1,936
    A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.
    🔗 orbit-ml.readthedocs.io/en/stable

  15. alkaline-ml/pmdarima ⭐ 1,614
    A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
    🔗 www.alkaline-ml.com/pmdarima

  16. time-series-foundation-models/lag-llama ⭐ 1,342
    Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

  17. winedarksea/AutoTS ⭐ 1,185
    Automated Time Series Forecasting

  18. autoviml/Auto_TS ⭐ 745
    Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Created by Ram Seshadri. Collaborators welcome.

  19. google/temporian ⭐ 685
    Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖
    🔗 temporian.readthedocs.io

Typing

Typing libraries: static and run-time type checking, annotations.

  1. python/mypy ⭐ 18,890
    Optional static typing for Python
    🔗 www.mypy-lang.org

  2. microsoft/pyright ⭐ 13,781
    Static Type Checker for Python

  3. facebook/pyre-check ⭐ 6,910
    Performant type-checking for Python.
    🔗 pyre-check.org

  4. python-attrs/attrs ⭐ 5,396
    Python Classes Without Boilerplate
    🔗 www.attrs.org

  5. google/pytype ⭐ 4,825
    A static type analyzer for Python code
    🔗 google.github.io/pytype

  6. instagram/MonkeyType ⭐ 4,825
    A Python library that generates static type annotations by collecting runtime types

  7. python/typeshed ⭐ 4,478
    Collection of library stubs for Python, with static types

  8. mtshiba/pylyzer ⭐ 2,639
    A fast, feature-rich static code analyzer & language server for Python
    🔗 mtshiba.github.io/pylyzer

  9. microsoft/pylance-release ⭐ 1,739
    Fast, feature-rich language support for Python. Documentation and issues for Pylance.

  10. agronholm/typeguard ⭐ 1,588
    Run-time type checker for Python

  11. patrick-kidger/torchtyping ⭐ 1,416
    Type annotations and dynamic checking for a tensor's shape, dtype, names, etc.

  12. robertcraigie/pyright-python ⭐ 194
    Python command line wrapper for pyright, a static type checker
    🔗 pypi.org/project/pyright

Utility

General utility libraries: miscellaneous tools, linters, code formatters, version management, package tools, documentation tools.

  1. yt-dlp/yt-dlp ⭐ 99,072
    A feature-rich command-line audio/video downloader
    🔗 discord.gg/h5mncfw63r

  2. home-assistant/core ⭐ 76,261
    🏡 Open source home automation that puts local control and privacy first.
    🔗 www.home-assistant.io

  3. abi/screenshot-to-code ⭐ 67,970
    Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
    🔗 screenshottocode.com

  4. python/cpython ⭐ 65,073
    The Python programming language
    🔗 www.python.org

  5. localstack/localstack ⭐ 57,515
    💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline
    🔗 localstack.cloud

  6. faif/python-patterns ⭐ 40,871
    A collection of design patterns/idioms in Python

  7. mingrammer/diagrams ⭐ 40,212
    🎨 Diagram as Code for prototyping cloud system architectures
    🔗 diagrams.mingrammer.com

  8. ggerganov/whisper.cpp ⭐ 37,385
    Port of OpenAI's Whisper model in C/C++

  9. paul-gauthier/aider ⭐ 26,416
    Aider is a command line tool that lets you pair program with LLMs, to edit code stored in your local git repository
    🔗 aider.chat

  10. openai/openai-python ⭐ 24,355
    The official Python library for the OpenAI API
    🔗 pypi.org/project/openai

  11. keon/algorithms ⭐ 24,272
    Minimal examples of data structures and algorithms in Python

  12. modularml/mojo ⭐ 23,641
    The Mojo Programming Language
    🔗 docs.modular.com/mojo/manual

  13. norvig/pytudes ⭐ 23,279
    Python programs, usually short, of considerable difficulty, to perfect particular skills.

  14. pydantic/pydantic ⭐ 22,286
    Data validation using Python type hints
    🔗 docs.pydantic.dev

  15. squidfunk/mkdocs-material ⭐ 21,982
    Documentation that simply works
    🔗 squidfunk.github.io/mkdocs-material

  16. facebookresearch/audiocraft ⭐ 21,419
    Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

  17. blakeblackshear/frigate ⭐ 20,867
    NVR with realtime local object detection for IP cameras
    🔗 frigate.video

  18. chriskiehl/Gooey ⭐ 20,808
    Turn (almost) any Python command line program into a full GUI application with one line

  19. delgan/loguru ⭐ 20,673
    Python logging made (stupidly) simple

  20. micropython/micropython ⭐ 19,819
    MicroPython - a lean and efficient Python implementation for microcontrollers and constrained systems
    🔗 micropython.org

  21. mkdocs/mkdocs ⭐ 19,804
    Project documentation with Markdown.
    🔗 www.mkdocs.org

  22. rustpython/RustPython ⭐ 19,568
    A Python Interpreter written in Rust
    🔗 rustpython.github.io

  23. higherorderco/Bend ⭐ 17,992
    A massively parallel, high-level programming language
    🔗 higherorderco.com

  24. kivy/kivy ⭐ 17,985
    Open source UI framework written in Python, running on Windows, Linux, macOS, Android and iOS
    🔗 kivy.org

  25. ipython/ipython ⭐ 16,388
    Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
    🔗 ipython.readthedocs.org

  26. alievk/avatarify-python ⭐ 16,358
    Avatars for Zoom, Skype and other video-conferencing apps.

  27. openai/triton ⭐ 14,261
    Development repository for the Triton language and compiler
    🔗 triton-lang.org

  28. zulko/moviepy ⭐ 12,972
    Video editing with Python
    🔗 zulko.github.io/moviepy

  29. pyo3/pyo3 ⭐ 12,927
    Rust bindings for the Python interpreter
    🔗 pyo3.rs

  30. pyodide/pyodide ⭐ 12,653
    Pyodide is a Python distribution for the browser and Node.js based on WebAssembly
    🔗 pyodide.org/en/stable

  31. pytube/pytube ⭐ 12,579
    A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.
    🔗 pytube.io

  32. caronc/apprise ⭐ 12,551
    Apprise - Push Notifications that work with just about every platform!
    🔗 hub.docker.com/r/caronc/apprise

  33. python-pillow/Pillow ⭐ 12,509
    The Python Imaging Library adds image processing capabilities to Python (Pillow is the friendly PIL fork)
    🔗 python-pillow.github.io

  34. nuitka/Nuitka ⭐ 12,454
    Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.4-3.13. You feed it your Python app, it does a lot of clever things, and spits out an executable or extension module.
    🔗 nuitka.net

  35. dbader/schedule ⭐ 11,959
    Python job scheduling for humans.
    🔗 schedule.readthedocs.io

  36. ninja-build/ninja ⭐ 11,550
    Ninja is a small build system with a focus on speed.
    🔗 ninja-build.org

  37. secdev/scapy ⭐ 11,052
    Scapy: the Python-based interactive packet manipulation program & library.
    🔗 scapy.net

  38. asweigart/pyautogui ⭐ 10,790
    A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.

  39. magicstack/uvloop ⭐ 10,621
    Ultra fast asyncio event loop.

  40. pallets/jinja ⭐ 10,558
    A very fast and expressive template engine.
    🔗 jinja.palletsprojects.com

  41. aristocratos/bpytop ⭐ 10,453
    Linux/OSX/FreeBSD resource monitor

  42. cython/cython ⭐ 9,743
    The most widely used Python to C compiler
    🔗 cython.org

  43. aws/serverless-application-model ⭐ 9,402
    The AWS Serverless Application Model (AWS SAM) transform is an AWS CloudFormation macro that transforms SAM templates into CloudFormation templates.
    🔗 aws.amazon.com/serverless/sam

  44. paramiko/paramiko ⭐ 9,236
    The leading native Python SSHv2 protocol library.
    🔗 paramiko.org

  45. boto/boto3 ⭐ 9,175
    AWS SDK for Python
    🔗 aws.amazon.com/sdk-for-python

  46. facebookresearch/hydra ⭐ 9,011
    Hydra is a framework for elegantly configuring complex applications
    🔗 hydra.cc

  47. arrow-py/arrow ⭐ 8,780
    🏹 Better dates & times for Python
    🔗 arrow.readthedocs.io

  48. py-pdf/pypdf ⭐ 8,686
    A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
    🔗 pypdf.readthedocs.io/en/latest

  49. xonsh/xonsh ⭐ 8,517
    🐚 Python-powered shell. Full-featured and cross-platform.
    🔗 xon.sh

  50. eternnoir/pyTelegramBotAPI ⭐ 8,252
    Python Telegram Bot API.

  51. jasonppy/VoiceCraft ⭐ 8,083
    Zero-Shot Speech Editing and Text-to-Speech in the Wild

  52. kellyjonbrazil/jc ⭐ 8,020
    CLI tool and Python library that converts the output of popular command-line tools, file-types, and common strings to JSON, YAML, or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts.

  53. googleapis/google-api-python-client ⭐ 7,957
    🐍 The official Python client library for Google's discovery based APIs.
    🔗 googleapis.github.io/google-api-python-client/docs

  54. theskumar/python-dotenv ⭐ 7,859
    Reads key-value pairs from a .env file and can set them as environment variables. It helps in developing applications following the 12-factor principles.
    🔗 saurabh-kumar.com/python-dotenv

  55. googlecloudplatform/python-docs-samples ⭐ 7,569
    Code samples used on cloud.google.com

  56. icloud-photos-downloader/icloud_photos_downloader ⭐ 7,549
    A command-line tool to download photos from iCloud

  57. google/latexify_py ⭐ 7,397
    A library to generate LaTeX expression from Python code.

  58. pygithub/PyGithub ⭐ 7,168
    Typed interactions with the GitHub API v3
    🔗 pygithub.readthedocs.io

  59. marshmallow-code/marshmallow ⭐ 7,090
    A lightweight library for converting complex objects to and from simple Python datatypes.
    🔗 marshmallow.readthedocs.io

  60. jd/tenacity ⭐ 7,002
    Retrying library for Python
    🔗 tenacity.readthedocs.io

  61. bndr/pipreqs ⭐ 7,001
    pipreqs - Generate pip requirements.txt file based on imports of any project. Looking for maintainers to move this project forward.

  62. hugapi/hug ⭐ 6,868
    Embrace the APIs of the future. Hug aims to make developing APIs as simple as possible, but no simpler.

  63. pyca/cryptography ⭐ 6,854
    cryptography is a package designed to expose cryptographic primitives and recipes to Python developers.
    🔗 cryptography.io

  64. sphinx-doc/sphinx ⭐ 6,789
    The Sphinx documentation generator
    🔗 www.sphinx-doc.org

  65. gorakhargosh/watchdog ⭐ 6,734
    Python library and shell utilities to monitor filesystem events.
    🔗 packages.python.org/watchdog

  66. openai/point-e ⭐ 6,617
    Point cloud diffusion for 3D model synthesis

  67. timdettmers/bitsandbytes ⭐ 6,592
    Accessible large language models via k-bit quantization for PyTorch.
    🔗 huggingface.co/docs/bitsandbytes/main/en/index

  68. ijl/orjson ⭐ 6,528
    Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy

  69. agronholm/apscheduler ⭐ 6,484
    Task scheduling library for Python

  70. sdispater/pendulum ⭐ 6,346
    Python datetimes made easy
    🔗 pendulum.eustace.io

  71. pdfminer/pdfminer.six ⭐ 6,170
    Community maintained fork of pdfminer - we fathom PDF
    🔗 pdfminersix.readthedocs.io

  72. scikit-image/scikit-image ⭐ 6,165
    Image processing in Python
    🔗 scikit-image.org

  73. wireservice/csvkit ⭐ 6,077
    A suite of utilities for converting to and working with CSV, the king of tabular file formats.
    🔗 csvkit.readthedocs.io

  74. pytransitions/transitions ⭐ 5,908
    A lightweight, object-oriented finite state machine implementation in Python with many extensions

  75. rsalmei/alive-progress ⭐ 5,681
    A new kind of Progress Bar, with real-time throughput, ETA, and very cool animations!

  76. traceloop/openllmetry ⭐ 5,362
    Open-source observability for your LLM application, based on OpenTelemetry
    🔗 www.traceloop.com/openllmetry

  77. spotify/pedalboard ⭐ 5,358
    🎛 🔊 A Python library for audio.
    🔗 spotify.github.io/pedalboard

  78. buildbot/buildbot ⭐ 5,305
    Python-based continuous integration testing framework; your pull requests are more than welcome!
    🔗 www.buildbot.net

  79. prompt-toolkit/ptpython ⭐ 5,259
    A better Python REPL

  80. pywinauto/pywinauto ⭐ 5,128
    Windows GUI Automation with Python (based on text properties)
    🔗 pywinauto.github.io

  81. pycqa/pycodestyle ⭐ 5,065
    Simple Python style checker in one Python file
    🔗 pycodestyle.pycqa.org

  82. tebelorg/RPA-Python ⭐ 5,057
    Python package for doing RPA

  83. pythonnet/pythonnet ⭐ 4,892
    Python for .NET is a package that gives Python programmers nearly seamless integration with the .NET Common Language Runtime (CLR) and provides a powerful application scripting tool for .NET developers.
    🔗 pythonnet.github.io

  84. jorgebastida/awslogs ⭐ 4,887
    AWS CloudWatch logs for Humans™

  85. comet-ml/opik ⭐ 4,851
    Opik is an open-source platform for evaluating, testing and monitoring LLM applications.
    🔗 www.comet.com/docs/opik

  86. pytoolz/toolz ⭐ 4,762
    A functional standard library for Python.
    🔗 toolz.readthedocs.org

  87. hhatto/autopep8 ⭐ 4,589
    A tool that automatically formats Python code to conform to the PEP 8 style guide.
    🔗 pypi.org/project/autopep8

  88. pyinvoke/invoke ⭐ 4,464
    Pythonic task management & command execution.
    🔗 pyinvoke.org

  89. bogdanp/dramatiq ⭐ 4,459
    A fast and reliable background task processing library for Python 3.
    🔗 dramatiq.io

  90. ashleve/lightning-hydra-template ⭐ 4,419
    PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡

  91. blealtan/efficient-kan ⭐ 4,209
    An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).

  92. adafruit/circuitpython ⭐ 4,191
    CircuitPython - a Python implementation for teaching coding with microcontrollers
    🔗 circuitpython.org

  93. pyo3/maturin ⭐ 4,158
    Build and publish crates with pyo3, cffi and uniffi bindings as well as Rust binaries as Python packages
    🔗 maturin.rs

  94. ets-labs/python-dependency-injector ⭐ 4,131
    Dependency injection framework for Python
    🔗 python-dependency-injector.ets-labs.org

  95. evhub/coconut ⭐ 4,130
    Coconut (coconut-lang.org) is a variant of Python that adds on top of Python syntax new features for simple, elegant, Pythonic functional programming.
    🔗 coconut-lang.org

  96. miguelgrinberg/python-socketio ⭐ 4,080
    Python Socket.IO server and client

  97. pyinfra-dev/pyinfra ⭐ 4,051
    pyinfra turns Python code into shell commands and runs them on your servers. Execute ad-hoc commands and write declarative operations. Target SSH servers, local machine and Docker containers. Fast and scales from one server to thousands.
    🔗 pyinfra.com

  98. joblib/joblib ⭐ 3,962
    Computing with Python functions.
    🔗 joblib.readthedocs.org

  99. python-markdown/markdown ⭐ 3,877
    A Python implementation of John Gruber’s Markdown with Extension support.
    🔗 python-markdown.github.io

  100. rspeer/python-ftfy ⭐ 3,848
    Fixes mojibake and other glitches in Unicode text, after the fact.
    🔗 ftfy.readthedocs.org

  101. zeromq/pyzmq ⭐ 3,790
    PyZMQ: Python bindings for ZeroMQ
    🔗 zguide.zeromq.org/py:all

  102. more-itertools/more-itertools ⭐ 3,785
    More routines for operating on iterables, beyond itertools
    🔗 more-itertools.rtfd.io

  103. hynek/structlog ⭐ 3,741
    Simple, powerful, and fast logging for Python.
    🔗 www.structlog.org

  104. pydata/xarray ⭐ 3,696
    N-D labeled arrays and datasets in Python
    🔗 xarray.dev

  105. spotify/basic-pitch ⭐ 3,657
    A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
    🔗 basicpitch.io

  106. pypi/warehouse ⭐ 3,651
    The Python Package Index
    🔗 pypi.org

  107. tartley/colorama ⭐ 3,605
    Simple cross-platform colored terminal text in Python

  108. osohq/oso ⭐ 3,483
    Deprecated: See README

  109. jorisschellekens/borb ⭐ 3,444
    borb is a library for reading, creating and manipulating PDF files in Python.
    🔗 borbpdf.com

  110. suor/funcy ⭐ 3,399
    Fancy and practical functional tools

  111. pyserial/pyserial ⭐ 3,308
    Python serial port access library

  112. camelot-dev/camelot ⭐ 3,139
    A Python library to extract tabular data from PDFs
    🔗 camelot-py.readthedocs.io

  113. libaudioflux/audioFlux ⭐ 2,981
    A library for audio and music analysis, feature extraction.
    🔗 audioflux.top

  114. legrandin/pycryptodome ⭐ 2,926
    A self-contained cryptographic library for Python
    🔗 www.pycryptodome.org

  115. tox-dev/pipdeptree ⭐ 2,851
    A command line utility to display dependency tree of the installed Python packages
    🔗 pypi.python.org/pypi/pipdeptree

  116. lxml/lxml ⭐ 2,754
    The lxml XML toolkit for Python
    🔗 lxml.de

  117. liiight/notifiers ⭐ 2,690
    The easy way to send notifications
    🔗 notifiers.readthedocs.io

  118. whylabs/whylogs ⭐ 2,685
    An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
    🔗 whylogs.readthedocs.io

  119. cdgriffith/Box ⭐ 2,667
    Python dictionaries with advanced dot notation access
    🔗 github.com/cdgriffith/box/wiki

  120. pexpect/pexpect ⭐ 2,658
    A Python module for controlling interactive programs in a pseudo-terminal
    🔗 pexpect.readthedocs.io

  121. pydantic/logfire ⭐ 2,629
    Uncomplicated Observability for Python and beyond! 🪵🔥
    🔗 logfire.pydantic.dev/docs

  122. litl/backoff ⭐ 2,629
    Python library providing function decorators for configurable backoff and retry

  123. yaml/pyyaml ⭐ 2,625
    Canonical source repository for PyYAML

  124. scrapinghub/dateparser ⭐ 2,598
    Python parser for human-readable dates

  125. pypa/setuptools ⭐ 2,586
    Official project repository for the Setuptools build system
    🔗 pypi.org/project/setuptools

  126. jcrist/msgspec ⭐ 2,578
    A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
    🔗 jcristharif.com/msgspec

  127. pyston/pyston ⭐ 2,504
    (No longer maintained) A faster and highly-compatible implementation of the Python programming language.
    🔗 www.pyston.org

  128. dosisod/refurb ⭐ 2,493
    A tool for refurbishing and modernizing Python codebases

  129. hgrecco/pint ⭐ 2,470
    Operate and manipulate physical quantities in Python
    🔗 pint.readthedocs.org

  130. nschloe/tikzplotlib ⭐ 2,459
    📊 Save matplotlib figures as TikZ/PGFplots for smooth integration into LaTeX.

  131. grantjenks/python-diskcache ⭐ 2,453
    Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.
    🔗 www.grantjenks.com/docs/diskcache

  132. dateutil/dateutil ⭐ 2,416
    Useful extensions to the standard Python datetime features

  133. tkem/cachetools ⭐ 2,411
    Various memoizing collections and decorators, including variants of the Python Standard Library's @lru_cache function decorator

  134. pndurette/gTTS ⭐ 2,384
    Python library and CLI tool to interface with Google Translate's text-to-speech API
    🔗 gtts.readthedocs.org

  135. rhettbull/osxphotos ⭐ 2,337
    Python app to work with pictures and associated metadata from Apple Photos on macOS. Also includes a package to provide programmatic access to the Photos library, pictures, and metadata.

  136. abseil/abseil-py ⭐ 2,320
    A collection of Python library code for building Python applications. The code is collected from Google's own Python code base, and has been extensively tested and used in production.

  137. kiminewt/pyshark ⭐ 2,308
    Python wrapper for tshark, allowing Python packet parsing using Wireshark dissectors

  138. pyparsing/pyparsing ⭐ 2,262
    Python library for creating PEG parsers

  139. astanin/python-tabulate ⭐ 2,244
    Pretty-print tabular data in Python, a library and a command-line utility. Repository migrated from bitbucket.org/astanin/python-tabulate.
    🔗 pypi.org/project/tabulate

  140. nateshmbhat/pyttsx3 ⭐ 2,222
    Offline text-to-speech synthesis for Python

  141. ianmiell/shutit ⭐ 2,149
    Automation framework for programmers
    🔗 ianmiell.github.io/shutit

  142. grahamdumpleton/wrapt ⭐ 2,093
    A Python module for decorators, wrappers and monkey patching.

  143. seperman/deepdiff ⭐ 2,085
    DeepDiff: Deep Difference and search of any Python object/data. DeepHash: Hash of any object based on its contents. Delta: Use deltas to reconstruct objects by adding deltas together.
    🔗 zepworks.com

  144. google/gin-config ⭐ 2,071
    Gin provides a lightweight configuration framework for Python

  145. omry/omegaconf ⭐ 2,041
    Flexible Python configuration system. The last one you will ever need.

  146. mitmproxy/pdoc ⭐ 2,022
    API Documentation for Python Projects
    🔗 pdoc.dev

  147. pyfilesystem/pyfilesystem2 ⭐ 2,019
    Python's Filesystem abstraction layer
    🔗 www.pyfilesystem.org

  148. python-rope/rope ⭐ 1,995
    A Python refactoring library

  149. julienpalard/Pipe ⭐ 1,989
    A Python library to use infix notation in Python

  150. numba/llvmlite ⭐ 1,983
    A lightweight LLVM Python binding for writing JIT compilers
    🔗 llvmlite.pydata.org

  151. landscapeio/prospector ⭐ 1,970
    Inspects Python source files and provides information about type and location of classes, methods etc

  152. hbldh/bleak ⭐ 1,928
    A cross platform Bluetooth Low Energy Client for Python using asyncio

  153. carpedm20/emoji ⭐ 1,924
    Emoji terminal output for Python

  154. pydoit/doit ⭐ 1,908
    CLI task management & automation tool
    🔗 pydoit.org

  155. chaostoolkit/chaostoolkit ⭐ 1,903
    Chaos Engineering Toolkit & Orchestration for Developers
    🔗 chaostoolkit.org

  156. pygments/pygments ⭐ 1,899
    Pygments is a generic syntax highlighter written in Python
    🔗 pygments.org

  157. open-telemetry/opentelemetry-python ⭐ 1,882
    OpenTelemetry Python API and SDK
    🔗 opentelemetry.io

  158. samuelcolvin/watchfiles ⭐ 1,875
    Simple, modern and fast file watching and code reload in Python.
    🔗 watchfiles.helpmanual.io

  159. p0dalirius/Coercer ⭐ 1,873
    A Python script to automatically coerce a Windows server to authenticate on an arbitrary machine through 12 methods.
    🔗 podalirius.net

  160. home-assistant/supervisor ⭐ 1,855
    🏡 Home Assistant Supervisor
    🔗 home-assistant.io/hassio

  161. joowani/binarytree ⭐ 1,811
    Python Library for Studying Binary Trees
    🔗 binarytree.readthedocs.io

  162. konradhalas/dacite ⭐ 1,805
    Simple creation of data classes from dictionaries.

  163. mkdocstrings/mkdocstrings ⭐ 1,797
    📘 Automatic documentation from sources, for MkDocs.
    🔗 mkdocstrings.github.io

  164. rubik/radon ⭐ 1,767
    Various code metrics for Python code
    🔗 radon.readthedocs.org

  165. kalliope-project/kalliope ⭐ 1,724
    Kalliope is a framework that will help you to create your own personal assistant.
    🔗 kalliope-project.github.io

  166. anthropics/anthropic-sdk-python ⭐ 1,669
    SDK providing access to Anthropic's safety-first language model APIs

  167. quodlibet/mutagen ⭐ 1,633
    Python module for handling audio metadata
    🔗 mutagen.readthedocs.io

  168. instagram/LibCST ⭐ 1,604
    A concrete syntax tree parser and serializer library for Python that preserves many aspects of Python's abstract syntax tree
    🔗 libcst.readthedocs.io

  169. facebookincubator/Bowler ⭐ 1,574
    Safe code refactoring for modern Python.
    🔗 pybowler.io

  170. imageio/imageio ⭐ 1,547
    Python library for reading and writing image data
    🔗 imageio.readthedocs.io

  171. fabiocaccamo/python-benedict ⭐ 1,528
    📘 dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), s3 support and many utilities.

  172. lcompilers/lpython ⭐ 1,519
    Python compiler
    🔗 lpython.org

  173. nficano/python-lambda ⭐ 1,501
    A toolkit for developing and deploying serverless Python code in AWS Lambda.

  174. aws-samples/aws-glue-samples ⭐ 1,459
    AWS Glue code samples

  175. lidatong/dataclasses-json ⭐ 1,403
    Easily serialize Data Classes to and from JSON

  176. brandon-rhodes/python-patterns ⭐ 1,366
    Source code behind the python-patterns.guide site by Brandon Rhodes

  177. aio-libs/yarl ⭐ 1,363
    Yet another URL library
    🔗 yarl.aio-libs.org

  178. ossf/criticality_score ⭐ 1,356
    Gives criticality score for an open source project

  179. oracle/graalpython ⭐ 1,299
    GraalPy – A high-performance embeddable Python 3 runtime for Java
    🔗 www.graalvm.org/python

  180. pypy/pypy ⭐ 1,237
    PyPy is a very fast and compliant implementation of the Python language.
    🔗 pypy.org

  181. pyo3/rust-numpy ⭐ 1,179
    PyO3-based Rust bindings of the NumPy C-API

  182. ariebovenberg/whenever ⭐ 1,174
    ⏰ Modern datetime library for Python
    🔗 whenever.rtfd.io

  183. pyfpdf/fpdf2 ⭐ 1,174
    Simple PDF generation for Python
    🔗 py-pdf.github.io/fpdf2

  184. pdoc3/pdoc ⭐ 1,147
    🐍 ➡️ 📜 Auto-generate API documentation for Python projects
    🔗 pdoc3.github.io/pdoc

  185. fsspec/filesystem_spec ⭐ 1,101
    A specification that Python filesystems should adhere to.

  186. milvus-io/pymilvus ⭐ 1,078
    Python SDK for Milvus.

  187. c4urself/bump2version ⭐ 1,072
    Version-bump your software with a single command
    🔗 pypi.python.org/pypi/bump2version

  188. metachris/logzero ⭐ 1,024
    Robust and effective logging for Python 2 and 3.
    🔗 logzero.readthedocs.io

  189. extensityai/symbolicai ⭐ 1,005
    Compositional Differentiable Programming Library - divide-and-conquer approach to break down a complex problem into smaller, more manageable problems.

  190. fastai/fastcore ⭐ 997
    Python supercharged for the fastai library
    🔗 fastcore.fast.ai

  191. lastmile-ai/aiconfig ⭐ 985
    AIConfig saves prompts, models and model parameters as source control friendly configs. This allows you to iterate on prompts and model parameters separately from your application code.
    🔗 aiconfig.lastmileai.dev

  192. juanbindez/pytubefix ⭐ 973
    Python3 library for downloading YouTube Videos.
    🔗 pytubefix.readthedocs.io

  193. barracuda-fsh/pyobd ⭐ 914
    An OBD-II compliant car diagnostic tool

  194. qdrant/qdrant-client ⭐ 849
    Python client for Qdrant vector search engine
    🔗 qdrant.tech

  195. samuelcolvin/dirty-equals ⭐ 839
    Doing dirty (but extremely useful) things with equals.
    🔗 dirty-equals.helpmanual.io

  196. tox-dev/filelock ⭐ 808
    A platform independent file lock in Python, which provides a simple way of inter-process communication
    🔗 py-filelock.readthedocs.io

  197. modal-labs/modal-examples ⭐ 774
    Examples of programs built using Modal
    🔗 modal.com/docs

  198. open-telemetry/opentelemetry-python-contrib ⭐ 768
    OpenTelemetry instrumentation for Python modules
    🔗 opentelemetry.io

  199. pypa/build ⭐ 767
    A simple, correct Python build frontend
    🔗 build.pypa.io

  200. gefyrahq/gefyra ⭐ 710
    Blazingly-fast 🚀, rock-solid, local application development ➡️ with Kubernetes.
    🔗 gefyra.dev

  201. instagram/Fixit ⭐ 674
    Advanced Python linting framework with auto-fixes and hierarchical configuration that makes it easy to write custom in-repo lint rules.
    🔗 fixit.rtfd.io/en/latest

  202. argoproj-labs/hera ⭐ 645
    Hera makes Python code easy to orchestrate on Argo Workflows through native Python integrations. It lets you construct and submit your Workflows entirely in Python. ⭐️ Remember to star!
    🔗 hera.rtfd.io

  203. google/pyglove ⭐ 639
    Manipulating Python Programs

  204. platformdirs/platformdirs ⭐ 638
    A small Python module for determining appropriate platform-specific dirs, e.g. a "user data dir".
    🔗 platformdirs.readthedocs.io

  205. fastai/ghapi ⭐ 636
    A delightful and complete interface to GitHub's amazing API
    🔗 ghapi.fast.ai

  206. methexis-inc/terminal-copilot ⭐ 572
    A smart terminal assistant that helps you find the right command.

  207. chrishayuk/mcp-cli ⭐ 542
    A protocol-level CLI designed to interact with a Model Context Protocol server. The client allows users to send commands, query data, and interact with various resources provided by the server.

  208. steamship-core/steamship-langchain ⭐ 510
    steamship-langchain

  209. pypdfium2-team/pypdfium2 ⭐ 504
    Python bindings to PDFium
    🔗 pypdfium2.readthedocs.io

  210. neuml/annotateai ⭐ 287
    Automatically annotates papers using Large Language Models (LLMs)

Visualisation

Visualisation tools and libraries: application frameworks, 2D/3D plotting, dashboards, WebGL.

  1. apache/superset ⭐ 64,196
    Apache Superset is a Data Visualization and Data Exploration Platform
    🔗 superset.apache.org

  2. streamlit/streamlit ⭐ 37,080
    Streamlit — A faster way to build and share data apps.
    🔗 streamlit.io

  3. gradio-app/gradio ⭐ 35,732
    Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
    🔗 www.gradio.app

  4. plotly/dash ⭐ 21,880
    Data Apps & Dashboards for Python. No JavaScript Required.
    🔗 plotly.com/dash

  5. danny-avila/LibreChat ⭐ 21,427
    LibreChat is a free, open source AI chat platform. This Web UI offers vast customization, supporting numerous AI providers, services, and integrations.
    🔗 librechat.ai

  6. matplotlib/matplotlib ⭐ 20,660
    matplotlib: plotting with Python
    🔗 matplotlib.org/stable

  7. bokeh/bokeh ⭐ 19,566
    Interactive Data Visualization in the browser, from Python
    🔗 bokeh.org

  8. plotly/plotly.py ⭐ 16,664
    The interactive graphing library for Python ✨ This project now includes Plotly Express!
    🔗 plotly.com/python

  9. mwaskom/seaborn ⭐ 12,814
    Statistical data visualization in Python
    🔗 seaborn.pydata.org

  10. visgl/deck.gl ⭐ 12,446
    WebGL2 powered visualization framework
    🔗 deck.gl

  11. marceloprates/prettymaps ⭐ 11,489
    A small set of Python functions to draw pretty maps from OpenStreetMap data. Based on osmnx, matplotlib and shapely libraries.

  12. altair-viz/altair ⭐ 9,555
    Declarative visualization library for Python
    🔗 altair-viz.github.io

  13. nvidia/TensorRT-LLM ⭐ 9,300
    TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT...
    🔗 nvidia.github.io/tensorrt-llm

  14. lux-org/lux ⭐ 5,244
    Automatically visualize your pandas dataframe via a single print! 📊 💡

  15. renpy/renpy ⭐ 5,216
    The Ren'Py Visual Novel Engine
    🔗 www.renpy.org

  16. holoviz/panel ⭐ 5,004
    Panel: The powerful data exploration & web app framework for Python
    🔗 panel.holoviz.org

  17. man-group/dtale ⭐ 4,836
    Visualizer for pandas data structures
    🔗 alphatechadmin.pythonanywhere.com

  18. has2k1/plotnine ⭐ 4,114
    A Grammar of Graphics for Python
    🔗 plotnine.org

  19. residentmario/missingno ⭐ 4,017
    missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities that allows you to get a quick visual summary of the completeness (or lack thereof) of your dataset.

  20. pyqtgraph/pyqtgraph ⭐ 3,965
    Fast data visualization and GUI tools for scientific / engineering applications
    🔗 www.pyqtgraph.org

  21. vispy/vispy ⭐ 3,368
    Main repository for Vispy
    🔗 vispy.org

  22. ml-tooling/opyrator ⭐ 3,117
    🪄 Turns your machine learning code into microservices with web API, interactive GUI, and more.
    🔗 opyrator-playground.mltooling.org

  23. netflix/flamescope ⭐ 3,036
    FlameScope is a visualization tool for exploring different time ranges as Flame Graphs.

  24. pyvista/pyvista ⭐ 2,912
    3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK)
    🔗 docs.pyvista.org

  25. facebookresearch/hiplot ⭐ 2,777
    HiPlot makes understanding high dimensional data easy
    🔗 facebookresearch.github.io/hiplot

  26. mckinsey/vizro ⭐ 2,775
    Vizro is a low-code toolkit for building high-quality data visualization apps.
    🔗 vizro.readthedocs.io/en/stable

  27. holoviz/holoviews ⭐ 2,744
    With Holoviews, your data visualizes itself.
    🔗 holoviews.org

  28. kozea/pygal ⭐ 2,688
    pygal is a dynamic SVG charting library written in python.
    🔗 www.pygal.org

  29. napari/napari ⭐ 2,268
    A fast, interactive, multi-dimensional image viewer for Python. It's designed for browsing, annotating, and analyzing large multi-dimensional images.
    🔗 napari.org

  30. marcomusy/vedo ⭐ 2,081
    A python module for scientific analysis of 3D data based on VTK and Numpy
    🔗 vedo.embl.es

  31. datapane/datapane ⭐ 1,389
    Build and share data reports in 100% Python
    🔗 datapane.com

  32. facultyai/dash-bootstrap-components ⭐ 1,139
    Bootstrap components for Plotly Dash
    🔗 dash-bootstrap-components.opensource.faculty.ai

  33. nomic-ai/deepscatter ⭐ 1,079
    Zoomable, animated scatterplots in the browser that scales over a billion points

  34. holoviz/holoviz ⭐ 858
    High-level tools to simplify visualization in Python.
    🔗 holoviz.org

  35. hazyresearch/meerkat ⭐ 836
    Creative interactive views of any dataset.

  36. anvaka/word2vec-graph ⭐ 707
    Exploring word2vec embeddings as a graph of nearest neighbors
    🔗 anvaka.github.io/pm/#/galaxy/word2vec-wiki?cx=-4651&cy=4492&cz=-1988&lx=-0.0915&ly=-0.9746&lz=-0.2030&lw=0.0237&ml=300&s=1.75&l=1&v=d50_clean_small

Web

Web related frameworks and libraries: webapp servers, WSGI, ASGI, asyncio, HTTP, REST, user management.

  1. django/django ⭐ 82,157
    The Web framework for perfectionists with deadlines.
    🔗 www.djangoproject.com

  2. tiangolo/fastapi ⭐ 80,493
    FastAPI framework, high performance, easy to learn, fast to code, ready for production
    🔗 fastapi.tiangolo.com

  3. pallets/flask ⭐ 68,703
    The Python micro framework for building web applications.
    🔗 flask.palletsprojects.com

  4. sherlock-project/sherlock ⭐ 62,208
    Hunt down social media accounts by username across social networks
    🔗 sherlockproject.xyz

  5. psf/requests ⭐ 52,451
    A simple, yet elegant, HTTP library.
    🔗 requests.readthedocs.io/en/latest

  6. tornadoweb/tornado ⭐ 21,824
    Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
    🔗 www.tornadoweb.org

  7. reflex-dev/reflex ⭐ 21,556
    🕸️ Web apps in pure Python 🐍
    🔗 reflex.dev

  8. wagtail/wagtail ⭐ 18,690
    A Django content management system focused on flexibility and user experience
    🔗 wagtail.org

  9. huge-success/sanic ⭐ 18,224
    Accelerate your web app development | Build fast. Run fast.
    🔗 sanic.dev

  10. pyscript/pyscript ⭐ 18,178
    A framework that allows users to create rich Python applications in the browser using HTML's interface and the power of Pyodide, WASM, and modern web technologies.
    🔗 pyscript.net

  11. vincigit00/Scrapegraph-ai ⭐ 17,801
    ScrapeGraphAI is a web scraping python library that uses LLM and direct graph logic to create scraping pipelines for websites and local documents
    🔗 scrapegraphai.com

  12. aio-libs/aiohttp ⭐ 15,381
    Asynchronous HTTP client/server framework for asyncio and Python
    🔗 docs.aiohttp.org

  13. encode/httpx ⭐ 13,648
    A next generation HTTP client for Python. 🦋
    🔗 www.python-httpx.org

  14. getpelican/pelican ⭐ 12,711
    Static site generator that supports Markdown and reST syntax. Powered by Python.
    🔗 getpelican.com

  15. flet-dev/flet ⭐ 12,265
    Flet enables developers to easily build realtime web, mobile and desktop apps in Python. No frontend experience required.
    🔗 flet.dev

  16. zauberzeug/nicegui ⭐ 10,882
    Create web-based user interfaces with Python. The nice way.
    🔗 nicegui.io

  17. aws/chalice ⭐ 10,752
    Python Serverless Microframework for AWS

  18. encode/starlette ⭐ 10,555
    The little ASGI framework that shines. 🌟
    🔗 www.starlette.io

  19. benoitc/gunicorn ⭐ 9,965
    gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
    🔗 www.gunicorn.org

  20. falconry/falcon ⭐ 9,585
    The no-magic web API and microservices framework for Python developers, with an emphasis on reliability and performance at scale.
    🔗 falcon.readthedocs.io

  21. encode/uvicorn ⭐ 8,835
    An ASGI web server, for Python. 🦄
    🔗 www.uvicorn.org

  22. bottlepy/bottle ⭐ 8,525
    bottle.py is a fast and simple micro-framework for python web-applications.
    🔗 bottlepy.org

  23. graphql-python/graphene ⭐ 8,140
    GraphQL framework for Python
    🔗 graphene-python.org

  24. reactive-python/reactpy ⭐ 7,937
    ReactPy is a library for building user interfaces in Python without Javascript
    🔗 reactpy.dev

  25. vitalik/django-ninja ⭐ 7,684
    💨 Fast, Async-ready, Openapi, type hints based framework for building APIs
    🔗 django-ninja.dev

  26. pyeve/eve ⭐ 6,712
    REST API framework designed for human beings
    🔗 python-eve.org

  27. pallets/werkzeug ⭐ 6,694
    The comprehensive WSGI web application library.
    🔗 werkzeug.palletsprojects.com

  28. starlite-api/litestar ⭐ 5,949
    Production-ready, Light, Flexible and Extensible ASGI API framework | Effortlessly Build Performant APIs
    🔗 litestar.dev

  29. webpy/webpy ⭐ 5,899
    web.py is a web framework for python that is as simple as it is powerful.
    🔗 webpy.org

  30. fastapi-users/fastapi-users ⭐ 4,855
    Ready-to-use and customizable users management for FastAPI
    🔗 fastapi-users.github.io/fastapi-users

  31. stephenmcd/mezzanine ⭐ 4,772
    CMS framework for Django
    🔗 mezzanine.jupo.org

  32. nameko/nameko ⭐ 4,729
    A microservices framework for Python that lets service developers concentrate on application logic and encourages testability.
    🔗 www.nameko.io

  33. pywebio/PyWebIO ⭐ 4,638
    Write interactive web app in script way.
    🔗 pywebio.readthedocs.io

  34. strawberry-graphql/strawberry ⭐ 4,140
    A GraphQL library for Python that leverages type annotations 🍓
    🔗 strawberry.rocks

  35. h2oai/wave ⭐ 4,044
    H2O Wave is a software stack for building beautiful, low-latency, realtime, browser-based applications and dashboards entirely in Python/R without using HTML, Javascript, or CSS.
    🔗 wave.h2o.ai

  36. pylons/pyramid ⭐ 4,006
    Pyramid - A Python web framework
    🔗 trypyramid.com

  37. websocket-client/websocket-client ⭐ 3,616
    WebSocket client for Python
    🔗 github.com/websocket-client/websocket-client

  38. unbit/uwsgi ⭐ 3,477
    uWSGI application server container
    🔗 projects.unbit.it/uwsgi

  39. pallets/quart ⭐ 3,144
    An async Python micro framework for building web applications.
    🔗 quart.palletsprojects.com

  40. fastapi-admin/fastapi-admin ⭐ 3,047
    A fast admin dashboard based on FastAPI and TortoiseORM with tabler ui, inspired by Django admin
    🔗 fastapi-admin-docs.long2ice.io

  41. flipkart-incubator/Astra ⭐ 2,538
    Automated Security Testing For REST API's

  42. dot-agent/nextpy ⭐ 2,256
    🤖Self-Modifying Framework from the Future 🔮 World's First AMS
    🔗 dotagent.ai

  43. masoniteframework/masonite ⭐ 2,238
    The Modern And Developer Centric Python Web Framework. Be sure to read the documentation and join the Discord channel for questions: https://discord.gg/TwKeFahmPZ
    🔗 docs.masoniteproject.com

  44. python-restx/flask-restx ⭐ 2,181
    Fork of Flask-RESTPlus: Fully featured framework for fast, easy and documented API development with Flask
    🔗 flask-restx.readthedocs.io/en/latest

  45. s3rius/FastAPI-template ⭐ 2,126
    Feature rich robust FastAPI template.

  46. neoteroi/BlackSheep ⭐ 2,084
    Fast ASGI web framework for Python
    🔗 www.neoteroi.dev/blacksheep

  47. dmontagu/fastapi-utils ⭐ 2,004
    Reusable utilities for FastAPI: a number of utilities to help reduce boilerplate and reuse common functionality across projects
    🔗 fastapiutils.github.io/fastapi-utils

  48. cherrypy/cherrypy ⭐ 1,874
    CherryPy is a pythonic, object-oriented HTTP framework. https://cherrypy.dev
    🔗 docs.cherrypy.dev

  49. indico/indico ⭐ 1,821
    Indico - A feature-rich event management system, made @ CERN, the place where the Web was born.
    🔗 getindico.io

  50. jordaneremieff/mangum ⭐ 1,794
    An adapter for running ASGI applications in AWS Lambda to handle Function URL, API Gateway, ALB, and Lambda@Edge events
    🔗 mangum.fastapiexpert.com

  51. wtforms/wtforms ⭐ 1,525
    A flexible forms validation and rendering library for Python.
    🔗 wtforms.readthedocs.io

  52. long2ice/fastapi-cache ⭐ 1,452
    fastapi-cache is a tool to cache fastapi response and function result, with backends support redis and memcached.
    🔗 github.com/long2ice/fastapi-cache

  53. awtkns/fastapi-crudrouter ⭐ 1,450
    A dynamic FastAPI router that automatically creates CRUD routes for your models
    🔗 fastapi-crudrouter.awtkns.com

  54. rstudio/py-shiny ⭐ 1,373
    Shiny for Python
    🔗 shiny.posit.co/py

  55. whitphx/stlite ⭐ 1,318
    A port of Streamlit to WebAssembly, powered by Pyodide.
    🔗 edit.share.stlite.net

  56. magicstack/httptools ⭐ 1,227
    Fast HTTP parser

  57. koxudaxi/fastapi-code-generator ⭐ 1,120
    This code generator creates FastAPI app from an openapi file.

  58. aeternalis-ingenium/FastAPI-Backend-Template ⭐ 692
    A backend project template with FastAPI, PostgreSQL with asynchronous SQLAlchemy 2.0, Alembic for asynchronous database migration, and Docker.


Interactive version: www.awesomepython.org, Hugging Face Dataset: awesome-python

Please raise a new issue to suggest a Python repo that you would like to see added.

1,709 hand-picked awesome Python libraries and frameworks, updated 06 Feb 2025
