Here I build a simple geospatial data pipeline that fetches data from ArcGIS Feature Servers, writes it to GeoParquet files in Cloudflare R2, and reads it back. (This is an example; please read my write-up for a thorough walkthrough of my rationale and development process.)
- Building a po' boy's spatial data pipeline from scratch with Dagster, GeoParquet, & R2
- Dagster - Getting Started
- GeoParquet.org
- Cloudflare R2
Visit the Dagster Getting started page. Dagster supports Python 3.8 through 3.12. Ensure you have one of the supported Python versions installed before proceeding.
First, clone this Dagster repository.
git clone https://github.com/alexlowellmartin/poboy-pipeline-example && cd poboy-pipeline-example
Second, install your Dagster code location as a Python package. The -e (--editable) flag tells pip to install the package in "editable mode", so local code changes apply automatically as you develop.
pip install -e ".[dev]"
Then, start the Dagster UI web server:
dagster dev
Open http://localhost:3000 with your browser to see the project.
You can start writing assets in poboy_pipeline_example/assets.py. The assets are automatically loaded into the Dagster code location as you define them.
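As a starting point for a fetch asset, here is a minimal sketch of how the paginated query URLs for an ArcGIS Feature Server layer could be built. It is not the code from this repository: the helper names build_query_url and page_offsets, the page size of 1000, and the example.com layer URL are all assumptions made for illustration. The query parameters themselves (where, outFields, outSR, f, resultOffset, resultRecordCount) come from the ArcGIS REST API.

```python
from typing import List
from urllib.parse import urlencode


def build_query_url(layer_url: str, offset: int = 0, page_size: int = 1000) -> str:
    """Build one paginated query URL for a Feature Server layer (hypothetical helper)."""
    params = {
        "where": "1=1",                   # match every feature in the layer
        "outFields": "*",                 # return all attribute fields
        "outSR": 4326,                    # request WGS 84 coordinates
        "f": "geojson",                   # ask the server for GeoJSON output
        "resultOffset": offset,           # index of the first record on this page
        "resultRecordCount": page_size,   # assumed page size; servers cap this
    }
    return f"{layer_url.rstrip('/')}/query?{urlencode(params)}"


def page_offsets(total: int, page_size: int = 1000) -> List[int]:
    """Return the resultOffset value for each page needed to cover `total` records."""
    return list(range(0, total, page_size))


if __name__ == "__main__":
    # Placeholder layer URL, not a real service.
    layer = "https://example.com/arcgis/rest/services/Demo/FeatureServer/0"
    for offset in page_offsets(2500):
        print(build_query_url(layer, offset=offset))
```

Inside an asset, each URL could then be read with a library such as GeoPandas (geopandas.read_file accepts a URL) and the combined result written out with GeoDataFrame.to_parquet, though how you wire that into R2 depends on your own resources and configuration.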