Healthcare Analytics: Diabetes Risk Prediction Model

Overview

An end-to-end machine learning project that predicts diabetes risk from patient health indicators. The best model, a Random Forest classifier, reaches 98.4% accuracy, which could support earlier diagnosis and intervention.

Dataset

Source: National Institute of Diabetes and Digestive and Kidney Diseases (via Kaggle)
Size: 2,768 patient records
Features: 8 health-related numerical attributes
Target: Binary classification (0: Healthy, 1: Diabetic)
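
A quick sanity check of the figures above might look like the following sketch. The column names `Outcome` (target) and `Id` (row identifier) are assumptions about the CSV layout, and `summarize_dataset` is a hypothetical helper, not a function from the notebooks.

```python
import pandas as pd


def summarize_dataset(df: pd.DataFrame, target: str = "Outcome", id_col: str = "Id") -> dict:
    """Return row count, feature names, and class counts for a quick check.

    Assumption: `target` holds the 0/1 label and `id_col`, if present,
    is an identifier rather than a health indicator.
    """
    features = [c for c in df.columns if c not in (target, id_col)]
    counts = df[target].value_counts().to_dict()
    return {
        "n_rows": len(df),
        "features": features,
        "class_counts": {int(k): int(v) for k, v in counts.items()},
    }


# Usage (file name taken from the project structure; run from the EDA folder):
# df = pd.read_csv("Healthcare-Diabetes.csv")
# print(summarize_dataset(df))
```

If the assumptions hold, `n_rows` should match the record count listed above and `features` should contain the 8 health indicators.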

Project Structure

├── EDA
│   ├── EDA (Exploratory data analysis).ipynb
│   └── Healthcare-Diabetes.csv
└── Native_Bayes_and_Random_Forest
    ├── Healthcare-Diabetes.csv
    └── models.ipynb

Methodology

Data Preprocessing
- Outlier detection and removal
- Feature selection based on correlation analysis
- No missing values or duplicates found
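
The preprocessing checks above can be sketched as follows. The README does not say which outlier rule the notebook uses, so the 1.5 x IQR rule here is an assumption, and both helper names are hypothetical.

```python
import pandas as pd


def check_quality(df: pd.DataFrame) -> dict:
    """Count missing cells and fully duplicated rows."""
    return {
        "missing": int(df.isna().sum().sum()),
        "duplicates": int(df.duplicated().sum()),
    }


def remove_outliers_iqr(df: pd.DataFrame, columns, k: float = 1.5) -> pd.DataFrame:
    """Keep rows whose values lie within [Q1 - k*IQR, Q3 + k*IQR] in every listed column."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        # A row is dropped if it falls outside the fence in any column.
        mask &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]
```

Filtering on several columns at once can remove a large share of rows, so it is worth comparing `len(df)` before and after.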

Exploratory Data Analysis
- Distribution analysis of health indicators
- Correlation analysis between variables
- Feature importance evaluation
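
As an illustration of correlation-based feature selection, the sketch below ranks numeric features by their absolute Pearson correlation with the target. The target name `Outcome`, the 0.1 threshold, and both function names are assumptions made for the example, not values taken from the notebook.

```python
import pandas as pd


def rank_by_target_correlation(df: pd.DataFrame, target: str = "Outcome") -> pd.Series:
    """Absolute Pearson correlation of each numeric feature with the target, highest first."""
    corr = df.select_dtypes("number").corr()[target].drop(target)
    return corr.abs().sort_values(ascending=False)


def select_features(df: pd.DataFrame, target: str = "Outcome", threshold: float = 0.1) -> list:
    """Names of features whose absolute correlation with the target meets the threshold."""
    ranked = rank_by_target_correlation(df, target)
    return ranked[ranked >= threshold].index.tolist()
```

The full matrix from `df.corr()` can also be passed to Seaborn's `heatmap` to inspect correlations between the features themselves.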

Model Development
- Implemented Bernoulli Naive Bayes and Random Forest classifiers
- Model comparison and evaluation using confusion matrices
- Random Forest outperformed Bernoulli Naive Bayes, reaching 98.4% accuracy
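
The comparison described above could be set up roughly as in this sketch. It uses synthetic data from `make_classification` as a stand-in for the real CSV, and the 80/20 stratified split, the seed, and `n_estimators=100` are assumptions rather than the notebook's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB


def compare_models(X, y, test_size=0.2, seed=42):
    """Fit both classifiers on the same split and report accuracy and confusion matrix."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    models = {
        # BernoulliNB binarizes each feature at 0.0 by default (binarize=0.0).
        "BernoulliNB": BernoulliNB(),
        "RandomForest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, pred),
            # Rows are true labels, columns are predicted labels.
            "confusion_matrix": confusion_matrix(y_test, pred, labels=[0, 1]),
        }
    return results


# Synthetic stand-in with 8 numeric features and a binary target.
X_demo, y_demo = make_classification(n_samples=500, n_features=8, random_state=0)
demo_results = compare_models(X_demo, y_demo)
for name, res in demo_results.items():
    print(name, round(res["accuracy"], 3))
    print(res["confusion_matrix"])
```

Because Bernoulli Naive Bayes thresholds every feature at a single value, strictly positive measurements would all binarize to 1 under the default setting, so that threshold deserves attention on real data.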

Technologies Used

Python (Pandas, NumPy)
Scikit-learn
Seaborn/Matplotlib

Results

Random Forest Classifier: 98.4% accuracy
Successfully identified key health indicators for diabetes prediction
Created visualization tools for healthcare provider decision support


Repository: zoeyai1221/Healthcare_Diabetes_Prediction_Model_ML