Demonstrate how MLFlow works by using the Credit Card Default Dataset
PART 1
Set Up MLFlow Experiment for Manual Tuning
Create Runs for Manual Tuning Experiment (captures different parameters based on user input)
Save Experiments and Runs on a local server
Save Experiments and Runs on a remote server (DagsHub)
PART 2
Set Up MLFlow Experiment for Hyperparameter Tuning
Create Runs for Hyperparameter Tuning Experiment
Run 1: DecisionTreeClassifier - Best Model
Run 2: DecisionTreeClassifier - Different Predictors
Run 3: DecisionTreeClassifier - Different Numerical Transformations
Run ∞: Repeat Runs using other classifier models
Decision Tree
https://archive.ics.uci.edu/static/public/350/default+of+credit+card+clients.zip
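The project tree lists a `download_data.py` script whose contents are not shown here. The following is only a sketch of how the archive at the URL above could be fetched and unpacked with the Python standard library; the function name `download_and_extract` and the default destination `02_data/01_raw` (taken from the tree's raw data folder) are assumptions.

```python
import io
import zipfile
from pathlib import Path
from urllib.request import urlopen

DATA_URL = (
    "https://archive.ics.uci.edu/static/public/350/"
    "default+of+credit+card+clients.zip"
)


def download_and_extract(url: str = DATA_URL, dest: str = "02_data/01_raw") -> list[str]:
    """Download a zip archive and extract it into dest; return the member names."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # The whole archive is read into memory, which is reasonable for small files.
    with urlopen(url) as response:
        payload = response.read()
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        archive.extractall(dest_path)
        return archive.namelist()
```

Running `download_and_extract()` from the repository root would place the extracted files under `02_data/01_raw`.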
Install all requirements by running the following command:
pip install -r requirements.txt
Hyperparameter Tuning: Manual
Pipeline: NA
Model Tracking: MLFlow
Deployment: NA
├── ...
├── 01_src # Source code
│ ├── download_data.py
├── 02_data
│ ├── 01_raw # Raw data files
│ ├── 02_processed # Processed data files
│ └── 03_external # Data from external sources
├── 03_notebooks # Notebooks used for pre-processing, exploration, model training, etc
├── 03_src # Source code
├── 04_models # Trained model files, model metadata, and evaluation results
├── 05_deployment # Project deployment files
├── 06_reports # Project documentation, Jupyter Notebook reports, final reports, and presentations
├── 07_config # Configuration files for hyperparameters, data sources, logging, environment, database, and deployment
├── 08_tests # Unit tests or test scripts
├── 09_environments # Environment setup file (dependencies)
├── README.md
└── ...
If you have something to add or a new idea to implement, you are welcome to open a pull request.
- MLFlow Documentation
- Introduction to MLFlow
- Setting Up MLFlow Experiments to a Remote Server
- Kaggle Notebook
- MLFlow Reference Notebook 1
- MLFlow Reference Notebook 2
If you find this repo useful, give it a star so that more people can discover it.