This is a simple batch processing pipeline project. The dashboard shows statistics for various activities on the GitHub repositories belonging to the Rust Language Project. The raw data is obtained from GitHub's Activity API.
The pipeline starts with the extractor, an AWS Lambda function, which fetches event data from GitHub's API and writes newline-delimited JSON to a file on AWS S3. Then the processor, an Apache Spark application, reads the file from S3, aggregates the data, and writes the results to an AWS DynamoDB table. The dashboard is a Spring Boot application that provides a web UI and a REST API endpoint for accessing the processed data stored in DynamoDB.
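For illustration, here is a minimal sketch of what the extractor Lambda could look like in Java 11. The GitHub endpoint, bucket name, and object key are assumptions made for the example rather than values taken from this project, and the conversion of the returned JSON array into newline-delimited JSON is omitted for brevity.

```java
// Hypothetical sketch of the extractor Lambda. The endpoint, bucket name, and
// key format are assumptions; the real handler converts the response to NDJSON.
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.LocalDate;
import java.util.Map;

public class ExtractorHandler implements RequestHandler<Map<String, String>, String> {

    // A repository events endpoint from GitHub's Activity REST API (assumed repo).
    private static final String EVENTS_URL =
            "https://api.github.com/repos/rust-lang/rust/events";

    private final HttpClient http = HttpClient.newHttpClient();
    private final S3Client s3 = S3Client.create();

    @Override
    public String handleRequest(Map<String, String> input, Context context) {
        try {
            // Fetch one page of raw event JSON from GitHub.
            HttpRequest request = HttpRequest.newBuilder(URI.create(EVENTS_URL))
                    .header("Accept", "application/vnd.github+json")
                    .build();
            HttpResponse<String> response =
                    http.send(request, HttpResponse.BodyHandlers.ofString());

            // Write the raw payload to S3 for the Spark job to pick up later.
            String key = "raw/events-" + LocalDate.now() + ".json";
            s3.putObject(PutObjectRequest.builder()
                            .bucket("github-raw-events") // assumed bucket name
                            .key(key)
                            .build(),
                    RequestBody.fromString(response.body()));
            return key;
        } catch (Exception e) {
            throw new RuntimeException("Failed to extract GitHub events", e);
        }
    }
}
```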
The biggest improvement that can be made to this architecture is to automate the data pipeline using a workflow orchestration tool like Apache Airflow.
- Java 11
- Spring Boot - Used to serve the dashboard UI and provide a REST endpoint for data (sketched below).
- Bulma - CSS library for styling the UI of the dashboard.
- Apache Spark - For distributed batch processing of the raw data from GitHub's API (sketched below).
- ChartJS - Used to create charts for visualizing the data.
- AWS
  - Lambda - Runs the extractor program that populates an S3 bucket with raw GitHub data.
  - S3 - Storage for raw GitHub data before the Spark job processes it.
  - DynamoDB - Persistence for the final output of the Spark job.
  - Elastic Beanstalk - Provides a way to conveniently deploy the dashboard on EC2.
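As a rough sketch of the processor, the example below assumes the job counts events per event type and, since the aggregate is small, writes it to DynamoDB from the driver with the AWS SDK. The S3 path, table name, attribute names, and the choice of aggregation are assumptions; the actual project may aggregate differently or use a Spark-DynamoDB connector.

```java
// Hypothetical sketch of the processor Spark job; paths, table, and attribute
// names are assumptions for illustration only.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;

import java.util.Map;

public class Processor {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("github-events-processor")
                .getOrCreate();

        // Spark reads newline-delimited JSON natively: one event object per line.
        Dataset<Row> events = spark.read().json("s3a://github-raw-events/raw/");

        // Aggregate: how many events of each type (PushEvent, IssuesEvent, ...) occurred.
        Dataset<Row> countsByType = events.groupBy("type").count();

        // The small aggregate is collected on the driver and written item by item.
        DynamoDbClient dynamo = DynamoDbClient.create();
        for (Row row : countsByType.collectAsList()) {
            dynamo.putItem(PutItemRequest.builder()
                    .tableName("github-event-stats") // assumed table name
                    .item(Map.of(
                            "eventType", AttributeValue.builder().s(row.getString(0)).build(),
                            "count", AttributeValue.builder().n(Long.toString(row.getLong(1))).build()))
                    .build());
        }
        spark.stop();
    }
}
```

Likewise, a minimal sketch of the dashboard's REST endpoint, assuming a `/api/stats` route and the table and attribute names from the processor sketch above. The dashboard page's ChartJS charts could consume the returned JSON directly.

```java
// Hypothetical sketch of the dashboard REST endpoint; route and names are assumptions.
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.ScanRequest;

import java.util.Map;
import java.util.stream.Collectors;

@RestController
public class StatsController {

    private final DynamoDbClient dynamo = DynamoDbClient.create();

    // Returns event counts per event type as JSON, e.g. {"PushEvent": 120, "IssuesEvent": 34}.
    @GetMapping("/api/stats")
    public Map<String, Long> stats() {
        return dynamo.scan(ScanRequest.builder().tableName("github-event-stats").build())
                .items().stream()
                .collect(Collectors.toMap(
                        item -> item.get("eventType").s(),
                        item -> Long.parseLong(item.get("count").n())));
    }
}
```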