A simple example of a data pipeline using Apache Airflow (orchestrator) and MinIO (S3-compatible object storage).

Below is the design of the project. (The draw.io file can be found in docs/architecture.drawio.)
- We need to create a `.env` file from `sample.env`:

  ```shell
  cp sample.env .env
  ```

- Add the Twitter Bearer Token to the `.env` file as below:

  ```shell
  TWITTER_BEARER_TOKEN="vNVxBVjj-0yhF!Ipc-p7Nrzl7C2wISOI6BLXVk087/jJS4auIp0SKSXI/7npGy1kl7xDXxRuJ55Lor5FHI!6!!a5v0!IrxCDYQDEgMBQzOZivgIEpQJsvC4A0nqFbqxA"
  ```
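Inside the pipeline code, the token can then be read from the environment (docker-compose forwards `.env` values into the containers). A minimal sketch, assuming the variable name above; the helper function itself is hypothetical and not part of this repo:

```python
import os


def get_bearer_token() -> str:
    """Read the Twitter Bearer Token injected via the .env file.

    Raises a clear error if the variable is missing, so a misconfigured
    .env fails fast instead of producing confusing 401s later.
    """
    token = os.environ.get("TWITTER_BEARER_TOKEN", "")
    if not token:
        raise RuntimeError("TWITTER_BEARER_TOKEN is not set; check your .env file")
    return token
```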
- We can simply run the pipeline using `docker-compose`.

  To start:

  ```shell
  docker compose up -d
  ```

  To shut down:

  ```shell
  docker compose down
  ```
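For orientation, a compose file wiring the two services together might look roughly like the sketch below. This is only an illustration of how the services and ports fit together; the repo's own `docker-compose.yml` is authoritative, and the image tags, credentials, and service names here are assumptions:

```yaml
# Sketch only -- see the repo's docker-compose.yml for the real definition.
services:
  airflow:
    image: apache/airflow:2.7.3   # assumed image tag
    ports:
      - "8080:8080"               # Airflow webserver
    env_file:
      - .env                      # forwards TWITTER_BEARER_TOKEN etc.
  minio:
    image: minio/minio            # assumed image
    command: server /data --console-address ":9090"
    ports:
      - "9000:9000"               # S3 API
      - "9090:9090"               # web console
```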
- Then we can connect to each service:
  - Apache Airflow: http://localhost:8080
  - MinIO Console: http://localhost:9090
This project is licensed under the MIT License - see the LICENSE file for details.
Feel free to use it as you see fit.
If you have any questions or would like to get in touch, you can email [email protected] or reach out on Twitter.