Suggested data sources for data science students at Thinkful.

Title | Description | Prospective Uses |
---|---|---|
American Community Survey 5-year Data (2009-2015) | Survey data from the U.S. Census Bureau available via API | Exploratory Analysis, Data Visualizations, Predictive Modeling |
General Social Survey, 2012 Merged Data | Panel data from three iterations of the General Social Survey of the United States, covering attitudes toward a variety of subjects. Contains both panel data and cross-sectional data. | Exploratory Analysis, Data Visualizations, Predictive Modeling |
Kaggle | Kaggle data sets, from a variety of sources and industries | Industry-based modeling/analytics |
ICPSR | Political and social science data sets from the Inter-university Consortium for Political and Social Research | Economics, Health |
UC Irvine ML Data | Machine Learning Repo from UC Irvine. Tons of data sets here with an emphasis on scale | Predictive Modeling, particularly unsupervised |
UN Data | Government data sets from the UN | All |
Quora Question | Great collection of various datasets, some explicitly called out here as well | All |
AWS Public Data Sets | Data sets on AWS infrastructure | Big Data |
Data.gov | U.S. government open data sets | Economics, Government |
GitHub Repo | Awesome GitHub repo with many, many datasets sorted by subject | All |

Highly specialized data sets:

Title | Description | Prospective Uses |
---|---|---|
Rijksmuseum API | A public-facing API from the Rijksmuseum in Amsterdam | Exploratory Analysis, Data Visualizations |
SFMOMA API | An API from the San Francisco Museum of Modern Art (SFMOMA) | Exploratory Analysis, Data Visualizations |
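
Several of the sources listed here are served over HTTP APIs rather than as downloadable files. As a rough starting point, here is a minimal sketch for the American Community Survey. It assumes the endpoint pattern `https://api.census.gov/data/{year}/acs/acs5`, the `get`/`for` query parameters, and a JSON response shaped as a list of rows whose first row is the header; verify all of these against the Census API documentation before relying on them. The helper names (`build_acs_url`, `rows_to_records`, `fetch_acs`) and the variable code `B01001_001E` are illustrative choices, not part of any official client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed URL pattern for the ACS 5-year endpoint; confirm in the Census API docs.
ACS5_URL = "https://api.census.gov/data/{year}/acs/acs5"


def build_acs_url(year, variables, geography):
    """Build a query URL, e.g. .../acs5?get=NAME,B01001_001E&for=state:*"""
    # Keep commas, colons, and asterisks unescaped so the URL stays readable.
    query = urlencode({"get": ",".join(variables), "for": geography}, safe=",:*")
    return ACS5_URL.format(year=year) + "?" + query


def rows_to_records(payload):
    """Convert a header-row-first list of lists into a list of dicts."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


def fetch_acs(year, variables, geography):
    """Download and parse one query (requires network access)."""
    with urlopen(build_acs_url(year, variables, geography)) as response:
        return rows_to_records(json.load(response))
```

Calling `fetch_acs(2015, ["NAME", "B01001_001E"], "state:*")` would return one dict per state if the assumptions above hold. The resulting list can be passed directly to `pandas.DataFrame` for exploratory analysis; numeric fields may arrive as strings and need converting first.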