PROJECT 1 - Data Modeling with Postgres

This project has been created for the new startup called Sparkify.

The purpose of the project is to understand which songs users are listening to, and to analyze the data by song and by user activity in the app.

Milestones of this project:

Build an ETL pipeline using Python
- Define fact and dimension tables for a star schema
- Create an ETL pipeline to transfer data from files to tables

Database Tables:

songs: dimension table describing the songs in the app
artists: dimension table describing the artists in the app
users: dimension table describing the app users
time: dimension table holding the timestamps of user activity
songplays: fact table recording user listening activity
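
The tables above form a star schema, with songplays at the center referencing the other four. The sketch below shows one way that layout could look. Every column name except user_agent is an assumption made for illustration, and the standard-library sqlite3 module stands in for Postgres so the snippet runs without a server.

```python
import sqlite3

# Hypothetical columns: only the table names and user_agent come from this README.
SCHEMA = """
CREATE TABLE users   (user_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE artists (artist_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE songs   (song_id TEXT PRIMARY KEY, title TEXT,
                      artist_id TEXT REFERENCES artists(artist_id));
CREATE TABLE time    (start_time TEXT PRIMARY KEY, hour INTEGER, weekday INTEGER);
CREATE TABLE songplays (
    songplay_id INTEGER PRIMARY KEY,
    start_time  TEXT REFERENCES time(start_time),
    user_id     INTEGER REFERENCES users(user_id),
    song_id     TEXT REFERENCES songs(song_id),
    artist_id   TEXT REFERENCES artists(artist_id),
    user_agent  TEXT
);
"""


def build_schema(conn):
    """Create all five tables and return their names in alphabetical order."""
    conn.executescript(SCHEMA)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")  # sqlite3 is a stand-in for Postgres here
tables = build_schema(conn)
print(tables)
```

The fact table holds one row per listening event plus keys into the dimension tables, so most analytical queries need at most one join per dimension.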

Procedure:

Set up:

Install Postgres and the psycopg2-binary Python package.
Configure the database connection with the host name and port number of the Postgres server.
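
As a rough sketch of the connection step, the host name and port could be read from the standard libpq environment variables and passed to psycopg2.connect. The helper names, the default database name sparkifydb, and the default user are assumptions for illustration and are not taken from this repo.

```python
import os


def connection_params(env=None):
    """Collect Postgres connection settings, falling back to local defaults."""
    env = os.environ if env is None else env
    return {
        "host": env.get("PGHOST", "127.0.0.1"),
        "port": int(env.get("PGPORT", "5432")),         # 5432 is the Postgres default port
        "dbname": env.get("PGDATABASE", "sparkifydb"),  # assumed database name
        "user": env.get("PGUSER", "postgres"),          # assumed user
        "password": env.get("PGPASSWORD", ""),
    }


def connect(env=None):
    """Open a connection; needs a running Postgres server and psycopg2 installed."""
    import psycopg2  # imported here so the helper above works without the package

    return psycopg2.connect(**connection_params(env))


# Example with a made-up host name and a non-default port.
params = connection_params({"PGHOST": "db.example.internal", "PGPORT": "5433"})
print(params["host"], params["port"])
```

Keeping the settings in one function means create_tables.py and etl.py could share the same host and port configuration.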

Process:

Run create_tables.py to create the schema.
1. This uses the queries in sql_queries.py to drop and create the tables.
Run etl.py to populate the tables.
1. This populates the tables using the queries in sql_queries.py.
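
The two steps above can be sketched as follows. The query lists and function names are hypothetical stand-ins for what sql_queries.py, create_tables.py and etl.py might contain, and sqlite3 replaces Postgres so the example runs on its own. With psycopg2 the placeholders would be %s rather than ?.

```python
import sqlite3

# Hypothetical stand-ins for the query lists that sql_queries.py might export.
drop_table_queries = [
    "DROP TABLE IF EXISTS songplays",
    "DROP TABLE IF EXISTS users",
]
create_table_queries = [
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, first_name TEXT)",
    "CREATE TABLE songplays (songplay_id INTEGER PRIMARY KEY, user_id INTEGER, user_agent TEXT)",
]
user_insert = "INSERT INTO users (user_id, first_name) VALUES (?, ?)"


def reset_tables(conn):
    """Step 1: drop any existing tables, then create fresh ones."""
    for query in drop_table_queries + create_table_queries:
        conn.execute(query)
    conn.commit()


def load_users(conn, records):
    """Step 2: insert already-parsed records into the users table."""
    rows = [(record["user_id"], record["first_name"]) for record in records]
    conn.executemany(user_insert, rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
reset_tables(conn)
reset_tables(conn)  # dropping first makes the step safe to rerun
load_users(conn, [{"user_id": 1, "first_name": "Ada"}, {"user_id": 2, "first_name": "Lin"}])
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(user_count)
```

Dropping before creating is what lets the schema step be rerun before each load without failing on tables that already exist.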

Files in this Repo:

Data
- log_data
- song_data
README.md
create_tables.py
etl.ipynb
etl.py
sql_queries.py
test.ipynb

Example query (number of song plays per user agent):

    SELECT COUNT(*) AS agent_count, user_agent
    FROM songplays
    GROUP BY user_agent;
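
To show what the query returns, here is a small sketch that runs it from Python. The rows are made-up sample data, sqlite3 stands in for Postgres, and the ORDER BY clause is an addition that makes the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the Postgres database
conn.execute("CREATE TABLE songplays (songplay_id INTEGER PRIMARY KEY, user_agent TEXT)")

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO songplays (user_agent) VALUES (?)",
    [("agent-a",), ("agent-b",), ("agent-a",)],
)

query = """
    SELECT COUNT(*) AS agent_count, user_agent
    FROM songplays
    GROUP BY user_agent
    ORDER BY agent_count DESC
"""
results = conn.execute(query).fetchall()
for agent_count, user_agent in results:
    print(agent_count, user_agent)
```

Each result row pairs a user agent with the number of song plays recorded for it.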

0xCakin/Data_Modeling_with_Postgres
