- Rio de Janeiro, Brazil
- in/machado-luiz
Stars
A lightweight data processing framework built on DuckDB and 3FS.
Full stack, modern web application template. Using FastAPI, React, SQLModel, PostgreSQL, Docker, GitHub Actions, automatic HTTPS and more.
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team co…
The Metadata Platform for your Data and AI Stack
The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Efficient data transformation and modeling framework that is backwards compatible with dbt.
Limbo is a project to build the modern evolution of SQLite.
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
Apache Superset is a Data Visualization and Data Exploration Platform
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
The official home of the Presto distributed SQL query engine for big data
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
ClickHouse® is a real-time analytics database management system
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
A framework for managing and maintaining multi-language pre-commit hooks.
An orchestration platform for the development, production, and observation of data assets.
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
An extremely fast Python linter and code formatter, written in Rust.
An extremely fast Python package and project manager, written in Rust.
Apache DataFusion Comet Spark Accelerator
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs