I want to compare how quality is distributed across the original content produced by streaming media companies.
The streaming media companies I want to look at are the major players:
- Netflix
- Amazon Prime Video
- Hulu
I plan to do this by:
- Extracting lists of Original Content (OC) from Wikipedia
- Creating "audience" and "critic" datasets by scraping Rotten Tomatoes
- Using simple exploratory data analysis to look at the quality distribution for each service
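As a rough idea of what the exploratory step might look like once the scores are collected, here is a minimal sketch using only the Python standard library. The rows, titles, and scores are made-up placeholders (not real Rotten Tomatoes data), and `summarize` is a hypothetical helper name; in practice pandas would probably be the more natural tool.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical placeholder rows: (service, title, critic_score, audience_score).
# These values are invented for illustration, not scraped data.
rows = [
    ("Netflix", "Show A", 90, 80),
    ("Netflix", "Show B", 60, 70),
    ("Hulu", "Show C", 75, 85),
    ("Hulu", "Show D", 85, 65),
    ("Amazon Prime Video", "Show E", 50, 60),
    ("Amazon Prime Video", "Show F", 70, 90),
]


def summarize(rows):
    """Group scores by service and compute simple summary statistics."""
    grouped = defaultdict(lambda: {"critic": [], "audience": []})
    for service, _title, critic, audience in rows:
        grouped[service]["critic"].append(critic)
        grouped[service]["audience"].append(audience)
    return {
        service: {
            "n_titles": len(scores["critic"]),
            "critic_mean": mean(scores["critic"]),
            "critic_median": median(scores["critic"]),
            "audience_mean": mean(scores["audience"]),
            "audience_median": median(scores["audience"]),
        }
        for service, scores in grouped.items()
    }


summary = summarize(rows)
for service, stats in summary.items():
    print(service, stats)
```

Means and medians are only a starting point; histograms or box plots per service would show the shape of each distribution more directly.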
Some Potential Limitations:
- Quantity of original content
- Number of reviews on Rotten Tomatoes (is it 1 or 1,000?)
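One possible way to handle the review-count limitation is to set aside titles with too few reviews before comparing scores. The sketch below assumes each title is a dict with hypothetical `score` and `n_reviews` fields; the records and the cutoff of 10 are placeholders chosen for illustration, not values taken from the data.

```python
# Hypothetical records; scores and review counts are invented placeholders.
titles = [
    {"title": "Show A", "score": 100, "n_reviews": 1},
    {"title": "Show B", "score": 82, "n_reviews": 1000},
    {"title": "Show C", "score": 40, "n_reviews": 12},
]

MIN_REVIEWS = 10  # assumed cutoff; tune it to the actual data


def split_by_review_count(titles, min_reviews=MIN_REVIEWS):
    """Return (kept, dropped) lists based on a minimum review count."""
    kept = [t for t in titles if t["n_reviews"] >= min_reviews]
    dropped = [t for t in titles if t["n_reviews"] < min_reviews]
    return kept, dropped


kept, dropped = split_by_review_count(titles)
print("kept:", [t["title"] for t in kept])
print("dropped:", [t["title"] for t in dropped])
```

Keeping the dropped titles in a separate list, rather than discarding them, makes it easy to check how much of each service's catalogue the cutoff removes.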
Potential Extensions:
- Look at user ROI
- Look at how user scores change over time (from release onwards)
- Can we detect anomalies?