layout | title | permalink |
---|---|---|
page | Portfolio | /portfolio/ |
A sampling of personal (and some work-related) projects I've been working on. I'm fortunate that most of my day job is open source; you can check out my daily contributions on my GitHub.
I created and maintain a Bluesky bot that posts a daily forecast of how busy the Austin airport will be, using a simple ARIMA model trained on historical TSA checkpoint volume data. As part of this work, I set up an ETL pipeline that processes hundreds of thousands of pages of PDFs posted to the TSA website.
{: .center} Data flow diagram of the forecast bot
<script async src="https://embed.bsky.app/static/embed.js" charset="utf-8"></script>
{: .center} *Example prediction*

Below normal wait times expected tomorrow: Higher than 0% of days.
— 🤖 Austin Airport Daily Wait Time Forecasting (@forecastaus.bsky.social) January 21, 2025 at 5:59 PM
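As a rough illustration of how a forecast value might be turned into the post text above, here is a minimal sketch. It assumes the message reports the share of historical days with a lower volume than the forecast; the `low`/`high` cutoffs, function names, and sample volumes are hypothetical and not taken from the bot itself. The forecast is passed in as a plain number, standing in for the ARIMA model's output.

```python
from bisect import bisect_left


def percentile_rank(history, value):
    """Percent of historical days whose volume is strictly below `value`."""
    ordered = sorted(history)
    return 100 * bisect_left(ordered, value) / len(ordered)


def format_post(history, forecast, low=33.0, high=67.0):
    """Build the post text; the low/high cutoffs are illustrative assumptions."""
    rank = percentile_rank(history, forecast)
    if rank < low:
        label = "Below normal"
    elif rank > high:
        label = "Above normal"
    else:
        label = "Normal"
    return f"{label} wait times expected tomorrow: Higher than {rank:.0f}% of days."


# Made-up daily checkpoint volumes, for illustration only.
history = [52_000, 48_500, 61_200, 57_800, 45_300]
print(format_post(history, forecast=40_000))
```

Ranking against the sorted history keeps the message easy to interpret: a forecast below every historical day reads as "Higher than 0% of days".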
One topic I frequently visualize is elections. My atx-elections-data repo contains several examples of code I have written to visualize elections, mostly in Texas.
{: .center} Precinct-level election shifts
For 2024's early voting period, I set up an ETL script that scraped live voter turnout data and plotted it alongside a comparison to the 2020 election.
{: .center} 2024 live voter turnout comparison
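The comparison step can be sketched roughly as follows, assuming each election's turnout arrives as a list of daily counts indexed by early-voting day. The function names and sample numbers are hypothetical, and the scraping and plotting steps are left out.

```python
from itertools import accumulate


def cumulative_by_day(daily_counts):
    """Running total of voters after each early-voting day."""
    return list(accumulate(daily_counts))


def compare_turnout(current, baseline):
    """Align two elections by early-voting day and compare cumulative turnout.

    `current` may be shorter than `baseline` while voting is still under way;
    zip() stops at the last day both series have.
    """
    rows = []
    pairs = zip(cumulative_by_day(current), cumulative_by_day(baseline))
    for day, (cur, base) in enumerate(pairs, start=1):
        pct = round(100 * cur / base, 1) if base else None
        rows.append({"day": day, "current": cur, "baseline": base, "pct_of_baseline": pct})
    return rows


# Made-up daily counts, for illustration only.
for row in compare_turnout(current=[10, 20], baseline=[20, 20, 20]):
    print(row)
```

Aligning on the day index rather than the calendar date lets a partially complete election be compared against a finished one.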
Using a tool called flowmap.blue I was able to quickly visualize multiple years of docked bicycle data. It is featured on flowmap.blue's examples page.
Created with: Python, Google Sheets
<iframe width="100%" height="600" src="https://www.flowmap.blue/1qIMB8jTEGMO6u1sLcuu5vQvP90jbENt904zMCV0A3DI/82227dc/embed" frameborder="0" allowfullscreen></iframe>
Programming:
- Python (expert)
- R (intermediate)
- JavaScript (intermediate)
Data Engineering:
- Extract, transform, load (ETL) scripting with Python and dbt
- Building and deploying Docker containers
- SQL (Postgres, Oracle) for database administration and data extraction/transformation
- Cloud orchestration with Prefect, on-premises orchestration with Apache Airflow
- Amazon Web Services (AWS): S3, EC2
- Google Cloud Platform (GCP): BigQuery, Cloud Functions, Cloud Storage
Data Science/Machine Learning:
- Machine learning: PyTorch, XGBoost, scikit-learn
- Experience applying deep learning, PCA, and supervised learning to real-world problems
Business Intelligence:
- Power BI (expert)
- Hex (expert)
- Tableau (intermediate)
- MicroStrategy (intermediate)
- Geospatial analysis and mapping with ArcGIS Online, GeoPandas, PostGIS
Education:
- Master of Science in Data Science, The University of Texas at Austin. Dec 2024
- Bachelor of Science in Aerospace Engineering, The University of Texas at Austin. Dec 2018
Certifications:
Master's Coursework:
- DSC 385T: Data Science for Health Discovery and Innovation
- DSC 391L: Principles of Machine Learning
- DSC 394R: Reinforcement Learning
- CS 388: Natural Language Processing
- CS 394D: Deep Learning
- CS 395T: Data Structures and Algorithms
- DSC 385: Data Exploration and Visualization
- DSC 383: Advanced Predictive Models for Complex Data
- DSC 382: Foundations of Regression and Predictive Modeling
- DSC 381: Probability and Simulation-Based Inference