A collection of Data Science resources and issues encountered. If something is missing or a link is not working, please create an issue here.
- Ace the Data Science Interview
- Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking 1st Edition
- Storytelling with Data: A Data Visualization Guide for Business Professionals 1st Edition (Free)
- Python Data Science Handbook (Free)
- The Complete Machine Learning Course with Python
- Data Management with Databricks: Big Data with Delta Lakes
- Notched Box Plots
- Boxplot and its pitfalls
- Histograms vs. KDEs Explained
- From Histograms to Kernel Density Estimation
- Statistical Tests I using R
- Statistical Tests II using R
- Feature Selection
- Collinearity
- What is multicollinearity and how to remove it?
- Variance inflation factor
- Detecting Multicollinearity with VIF – Python
- A Guide to Multicollinearity & VIF in Regression
- Scikit-Learn Warning: High Collinearity Detected in Features
- A Python library to remove collinearity
- Numerical Issues due to Multicollinearity: VIF and Condition Number
- UCLA: Lesson 3 Logistic Regression Diagnostics
- Multicollinearity and Condition Number of Logistic Regression (Answer by EdM)
- Missing Data
- Categorical Variables Encoding
- Align Train & Test Data
- Least Squares
- Python Libraries
- Pandas
- NumPy
- Scikit-Learn
- lazypredict Builds many basic models with little code and helps identify which models work better without any parameter tuning.
- PyTorch
- TensorFlow - Keras
- TensorBoard A tool for providing the measurements and visualizations needed during the machine learning workflow.
- PyTables A Python package for managing large amounts of data using the HDF5 library and NumPy.
- labelme: Image Polygonal Annotation with Python Useful for supervised learning on imaging data.
- Google Colab A free cloud service that lets you create and share interactive notebooks with code, text, and visualizations. You can use Colab for data science, machine learning, and AI applications, with access to GPUs, TPUs, and Gemini models.
- Python Libraries
- R Libraries