Daily Dose of Data Science is a publication on Substack that brings together intriguing frameworks, libraries, technologies, and tips that make the life cycle of a Data Science project effortless.
This repository is a collection of all the code snippets presented in my publication. If you want to receive these tips in your mailbox daily, you can subscribe to my Substack newsletter.
To download the tips listed here, clone this repo:

    git clone https://github.com/ChawlaAvi/Daily-Dose-of-Data-Science
- Pandas
- Jupyter Tips
- Python
- Plotting
- NumPy
- Memory Optimization
- Cool Tools
- Run-time Optimization
- Sklearn
- Debugging
- Missing Data
- ML-AI News
- Machine Learning
- Statistics
- Testing
- Terminal
- Documents
## Pandas

Title | Notebook | Substack | Article |
---|---|---|---|
Analyze A Pandas DataFrame Without Code | π | π | |
70x Faster Pandas By Changing Just One Line of Code | π | π | |
Reduce Memory Usage Of A Pandas DataFrame By 90% | π | π | Medium |
Speed-up Pandas Apply 5x with NumPy | π | π | |
A Lesser-Known Feature of Apply Method In Pandas | π | π | |
Create Pandas DataFrame from Dataclass | π | π | |
Run SQL in Jupyter To Analyze A Pandas DataFrame | π | π | |
When You Should Not Use the head() Method In Pandas | π | π | |
Three Lesser-known Tips For Reading a CSV File Using Pandas | π | π | |
The Best File Format To Store A Pandas DataFrame | π | π | Medium |
Lesser-Known Feature of the Merge Method in Pandas | π | π | |
The Best Way to Use Apply() in Pandas | π | π | |
A No-code Tool To Understand Your Data Quickly | π | π | |
Display Progress Bar With Apply() in Pandas | π | π | |
Supercharge value_counts() Method in Pandas With Sidetable | π | π | |
Explore CSV Data Right From The Terminal | π | π | |
Define the Correct DataType for Categorical Columns | π | π | Medium |
Don't Create Conditional Columns in Pandas with Apply | π | π | |
Write Your Own Flavor Of Pandas | π | π | |
Create DataFrame Hassle-free By Using Clipboard | π | π | |
Alter the Datatype of Multiple Columns at Once | π | π | |
Why you should not dump DataFrames to a CSV | π | π | Medium |
Why You Should Not Read CSVs with Pandas | π | π | Medium |
Parallelize Pandas Apply() With Swifter | π | π | |
A Hidden Feature of Describe Method In Pandas | π | π | |
Enrich Your Notebook With Interactive Controls | π | π | |
Data Analysis Using No-Code Pandas In Jupyter | π | π | |
Create Pivot Tables, Aggregations and Plots Without Any Code | π | π | Medium |
Parallelize Pandas with Pandarallel | π | π | Medium |
Pretty Plotting With Pandas | π | π | |
How to Read Multiple CSV Files Efficiently | π | π | Medium |
Configure Sklearn To Output Pandas DataFrame | π | π | |
Datatype For Handling Missing Valued Columns in Pandas | π | π | Medium |
Vectorization Does Not Always Guarantee Better Performance | π | π |
## Jupyter Tips

Title | Notebook | Substack | Article |
---|---|---|---|
Stop Previewing Raw DataFrames. Instead, Use DataTables. | π | π | |
Label Your Data With The Click Of A Button | π | π | |
The Coolest Jupyter Notebook Hack | π | π | |
View Documentation in Jupyter Notebook | π | π | |
Get Notified When Jupyter Cell Has Executed | π | π | |
Clear Cell Output In Jupyter Notebook During Run-time | π | π | |
CodeSquire: The AI Coding Assistant You Should Use Over GitHub Copilot | π | π | |
Find Your Code Hiding In Some Jupyter Notebook With Ease | π | π | |
Enrich Your Notebook With Interactive Controls | π | π | |
Data Analysis Using No-Code Pandas In Jupyter | π | π | |
Create Pivot Tables, Aggregations and Plots Without Any Code | π | π | Medium |
Restart Notebook Without Losing Variables | π | π | Medium |
Retrieve Previously Computed Output In Jupyter Notebook | π | π | Medium |
Transfer Variables Between Jupyter Notebooks | π | π | Medium |
## Python

Title | Notebook | Substack | Article |
---|---|---|---|
A Single Line That Will Make Your Python Code Faster | π | π | |
Make Dot Notation More Powerful in Python | π | π | |
An Elegant Way To Perform Shutdown Tasks in Python | π | π | |
What Are Class Methods and When To Use Them? | π | π | |
Hide Attributes While Printing A Dataclass Object | π | π | |
List : Tuple :: Set : ? | π | π | |
`__post_init__`: Add Attributes To A Dataclass Post Initialization | π | π | |
Simplify Your Functions With Partial Functions | π | π | |
DotMap: A Better Alternative to Python Dictionary | π | π | |
Prevent Wild Imports With `__all__` in Python | π | π | |
Performance Comparison of Python 3.11 and Python 3.10 | π | π | |
Why 256 is 256 But 257 is not 257? | π | π | |
Make a Class Object Behave Like a Function | π | π | |
Lesser-known Feature of Pickle Files | π | π | |
Specify Loops and Runs In %%timeit | π | π | |
Don't Use time.time() To Measure Execution Time | π | π | |
Import Your Python Package as a Module | π | π | |
Fine-grained Error Tracking With Python 3.11 | π | π | |
Run Python Project Directory As A Script | π | π | |
Use Slotted Class To Improve Your Python Code | π | π | |
Using Dictionaries In Place of If-conditions | π | π | |
In Defense of Match-case Statements in Python | π | π |
## Plotting

Title | Notebook | Substack | Article |
---|---|---|---|
Perfplot: Measure, Visualize and Compare Run-time With Ease | π | π | |
Prettify Word Clouds In Python | π | π | |
Calendar Map As A Richer Alternative to Line Plot | π | π | |
Density Plot As A Richer Alternative to Scatter Plot | π | π | Medium |
Python One-Liner To Create Sketchy Hand-drawn Plots | π | π | |
Create a Moving Bubbles Chart in Python | π | π | |
Visualizing Google Search Trends of 2022 using Python | π | π | |
Create A Racing Bar Chart In Python | π | π | |
Elegantly Plot the Decision Boundary of a Classifier | π | π | |
Dot Plot: A Potential Alternative to Bar Plot | π | π | Medium |
Hexbin Plots As A Richer Alternative to Scatter Plots | π | π | Medium |
Enrich Your Notebook With Interactive Controls | π | π | |
Regression Plot Made Easy with Plotly | π | π | |
Pretty Plotting With Pandas | π | π | |
Polynomial Linear Regression Plot Made Easy With Seaborn | π | π | |
Analyse Flow Data With Sankey Diagrams | π | π | |
Waterfall Charts: A Better Alternative to Line/Bar Plot | π | π | Medium |
## NumPy

Title | Notebook | Substack | Article |
---|---|---|---|
Speed-up NumPy 20x with Numexpr | π | π | |
An Elegant Way To Perform Matrix Multiplication | π | π | |
Difference Between Dot and Matmul in NumPy | π | π | |
Don't Print NumPy Arrays! Use Lovely-NumPy Instead | π | π | |
Polynomial Linear Regression with NumPy | π | π |
## Memory Optimization

Title | Notebook | Substack | Article |
---|---|---|---|
70x Faster Pandas By Changing Just One Line of Code | π | π | |
Reduce Memory Usage Of A Pandas DataFrame By 90% | π | π | Medium |
The Best File Format To Store A Pandas DataFrame | π | π | Medium |
Define the Correct DataType for Categorical Columns | π | π | Medium |
Datatype For Handling Missing Valued Columns in Pandas | π | π | Medium |
Save Memory with Python Generators | π | π |
## Cool Tools

Title | Notebook | Substack | Article |
---|---|---|---|
Preview Your README File Locally In GitHub Style | π | π | |
This GUI Tool Can Possibly Save You Hours Of Manual Work | π | π | |
Stop Previewing Raw DataFrames. Instead, Use DataTables. | π | π | |
Converting Python To LaTeX Has Possibly Never Been So Simple | π | π | |
Label Your Data With The Click Of A Button | π | π | |
Analyze A Pandas DataFrame Without Code | π | π | |
A No-Code Online Tool To Explore and Understand Neural Networks | π | π | |
Speed-up NumPy 20x with Numexpr | π | π | |
Debugging Made Easy With PySnooper | π | π | |
Deep Learning Network Debugging Made Easy | π | π | |
CodeSquire: The AI Coding Assistant You Should Use Over GitHub Copilot | π | π | |
Find Unused Python Code With Ease | π | π | |
Enrich Your Notebook With Interactive Controls | π | π | |
Data Analysis Using No-Code Pandas In Jupyter | π | π | |
Modify Python Code During Run-Time | π | π | Medium |
Modify Function During Run-Time | π | π | Medium |
Importing Modules Made Easy with Pyforest | π | π | |
Create Pivot Tables, Aggregations and Plots Without Any Code | π | π | Medium |
## Run-time Optimization

Title | Notebook | Substack | Article |
---|---|---|---|
A Single Line That Will Make Your Python Code Faster | π | π | |
Make Sklearn KMeans 20x Faster | π | π | |
Speed-up NumPy 20x with Numexpr | π | π | |
The Best File Format To Store A Pandas DataFrame | π | π | Medium |
The Best Way to Use Apply() in Pandas | π | π | |
Don't Create Conditional Columns in Pandas with Apply | π | π | |
Why you should not dump DataFrames to a CSV | π | π | Medium |
Parallelize Pandas Apply() With Swifter | π | π | |
Parallelize Pandas with Pandarallel | π | π | Medium |
How to Read Multiple CSV Files Efficiently | π | π | Medium |
## Sklearn

Title | Notebook | Substack | Article |
---|---|---|---|
Sklearn One-liner to Generate Synthetic Data | π | π | |
Skorch: Use Scikit-learn API on PyTorch Models | π | π | |
Make Sklearn KMeans 20x Faster | π | π | |
Build Baseline Models Effortlessly With Sklearn | π | π | |
Polynomial Linear Regression with NumPy | π | π | |
An Elegant Way to Import Metrics From Sklearn | π | π | |
Feature Tracking Made Simple In Sklearn Transformers | π | π | |
Configure Sklearn To Output Pandas DataFrame | π | π |
## Debugging

Title | Notebook | Substack | Article |
---|---|---|---|
Debugging Made Easy With PySnooper | π | π | |
Don't use print() to debug your code. | π | π | Medium |
Inspect Program Flow with IceCream | π | π | Medium |
Lesser-known Feature of f-strings in Python | π | π |
## Missing Data

Title | Notebook | Substack | Article |
---|---|---|---|
Handle Missing Data With Missingno | π | π | |
Datatype For Handling Missing Valued Columns in Pandas | π | π |
## ML-AI News

Title | Notebook | Substack | Article |
---|---|---|---|
Now You Can Use DALL·E With OpenAI API | π | π |
## Machine Learning

Title | Notebook | Substack | Article |
---|---|---|---|
How to Encode Categorical Features With Many Categories? | π | π | |
Why KMeans May Not Be The Apt Clustering Algorithm Always | π | π | |
Skorch: Use Scikit-learn API on PyTorch Models | π | π | |
A No-Code Online Tool To Explore and Understand Neural Networks | π | π | |
Make Sklearn KMeans 20x Faster | π | π | |
Deep Learning Network Debugging Made Easy | π | π | |
Build Baseline Models Effortlessly With Sklearn | π | π | |
Polynomial Linear Regression with NumPy | π | π |
## Statistics

Title | Notebook | Substack | Article |
---|---|---|---|
Pandas and NumPy Return Different Values for Standard Deviation. Why? | π | π | |
Why Correlation (and Other Statistics) Can Be Misleading | π | π |
## Testing

Title | Notebook | Substack | Article |
---|---|---|---|
Generate Your Own Fake Data In Seconds | π | π |
## Terminal

Title | Notebook | Substack | Article |
---|---|---|---|
Visualize Commit History of Git Repo With Beautiful Animations | π | π | |
How Would You Identify Fuzzy Duplicates In A Dataset With A Million Records? | π | π | |
Automated Code Refactoring With Sourcery | π | π | Medium |
Explore CSV Data Right From The Terminal | π | π |
## Documents

Title | Document | Substack | Article |
---|---|---|---|
10 Automated EDA Tools That Will Save You Hours Of (Tedious) Work | π | π | |
30 Python Libraries to (Hugely) Boost Your Data Science Productivity | π | π |