Stock Market Analytics ELT Workflow

Description

A data pipeline that automates the ELT workflow for stock market data and then builds a BI product on top of that data, such as a dashboard or a forecasting model.

Data Stack: [tool logos; images not available]

General Pipeline Structure

  • The pipeline consists of four layers that the data passes through:
    • Extraction and Load
    • Validation and quality gates
    • Transformation
    • BI
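
The four layers above can be sketched as a chain of plain Python functions. This is a minimal illustration under stated assumptions, not the project's actual code: the function names, the `PriceRow` record, and the in-memory `warehouse` list (standing in for a real warehouse table) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PriceRow:
    """Hypothetical record for one daily close price."""
    ticker: str
    day: date
    close: float


def extract_and_load(raw_rows, warehouse):
    """Layer 1: load raw rows into the warehouse unchanged (the 'EL' in ELT)."""
    warehouse.extend(PriceRow(t, d, c) for t, d, c in raw_rows)
    return warehouse


def validate(warehouse):
    """Layer 2: fail fast if any loaded row breaks a basic expectation."""
    for row in warehouse:
        if not row.ticker or row.close <= 0:
            raise ValueError(f"invalid row: {row}")
    return warehouse


def transform(warehouse):
    """Layer 3: derive the average close per ticker from the loaded rows."""
    totals = {}
    for row in warehouse:
        total, count = totals.get(row.ticker, (0.0, 0))
        totals[row.ticker] = (total + row.close, count + 1)
    return {ticker: total / count for ticker, (total, count) in totals.items()}


def report(metrics):
    """Layer 4: render the transformed metrics for a BI consumer."""
    return [f"{ticker}: {avg:.2f}" for ticker, avg in sorted(metrics.items())]


def run_pipeline(raw_rows):
    """Run the four layers in order on an empty in-memory warehouse."""
    warehouse = []
    return report(transform(validate(extract_and_load(raw_rows, warehouse))))
```

Because validation sits between load and transformation, a bad row stops the run before any derived metric is computed. That is the role the quality gate plays in the layered design.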

TO-DO

  • Save stock ticker data from Yahoo Finance to Google BigQuery
  • Create a Great Expectations suite and checkpoints, using the Great Expectations package, to validate and test the loaded data (Validation)
  • Set up a dbt-core project as a transformation layer on top of the source data
  • Automate styling and formatting by adding the following tasks (quality gates):
    • a task to format Python code with black
    • a task to lint with pylint, yamllint, and sqlfluff
    • a task to run unit tests with pytest and pytest-cov
  • Build the stocks transformations with dbt (Transformation)
  • Add dbt tests to all transformations, plus freshness checks on the sources
  • Add Python unit tests covering the core scripts' functionality
  • Create a dashboard that presents the transformed data (BI)
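
As a rough illustration of the unit-testing and freshness items above, here is a small sketch of two testable helpers with pytest-style tests. The helpers `daily_returns` and `is_fresh` and the 24-hour threshold are assumptions made for this example; they are not taken from the project, and a dbt source freshness check would normally be configured in dbt rather than written in Python.

```python
from datetime import datetime, timedelta, timezone


def daily_returns(closes):
    """Return simple day-over-day returns for a list of close prices."""
    if any(price <= 0 for price in closes):
        raise ValueError("close prices must be positive")
    return [(curr - prev) / prev for prev, curr in zip(closes, closes[1:])]


def is_fresh(last_loaded_at, now, max_age=timedelta(hours=24)):
    """Return True if the most recent load is no older than max_age."""
    return now - last_loaded_at <= max_age


# pytest collects functions named test_*; plain asserts are enough.
def test_daily_returns_basic():
    assert daily_returns([100.0, 150.0, 75.0]) == [0.5, -0.5]


def test_daily_returns_rejects_non_positive():
    # Written with try/except so the file also runs without pytest installed.
    try:
        daily_returns([100.0, 0.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError for a non-positive price")


def test_is_fresh():
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    assert is_fresh(now - timedelta(hours=23), now)
    assert not is_fresh(now - timedelta(hours=25), now)
```

Saved as a file such as `test_core.py` (a hypothetical name), these tests could be run with `pytest`, and with `pytest --cov` if pytest-cov is installed.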