🎯 Employee Promotion Prediction System

📄 Project Overview

This project focuses on building a Machine Learning Model to predict whether employees are eligible for a promotion based on their performance, achievements, and other attributes. The solution is designed to help HR departments make data-driven decisions while ensuring fairness and efficiency.

🔍 Objectives

Develop a robust ML model to predict employee promotions.
Handle class imbalance to ensure accurate predictions for underrepresented groups.
Provide a detailed EDA (Exploratory Data Analysis) to understand key patterns and relationships in the data.
Train models with optimized features for better interpretability and performance.

🛠️ Technologies and Tools Used

Programming Language: Python 🐍
Libraries:
- Data Processing: pandas, numpy
- Visualization: matplotlib, seaborn
- Machine Learning: scikit-learn, xgboost
- Oversampling: imbalanced-learn
- Model Persistence: joblib

📂 Project Structure

EDA-and-predictor-for-Employee-Eligible-for-Promotion/
├── data/                     # Contains datasets
│   ├── HRData.csv            # Raw dataset
│   ├── HRData_cleaned.csv    # Cleaned and preprocessed dataset
│   ├── hrdatatest.csv        # Test dataset
│   └── predictions_hrdatatest.csv # Predictions output
├── models/                   # Trained models and scalers
│   ├── model.py
│   ├── xgboost_model_smote.pkl
│   └── scaler.pkl
├── scripts/                  # Scripts for each step of the pipeline
│   ├── data_preprocessing.py # Data preprocessing and cleaning
│   ├── Exploratory_Data_Analysis.py  # Exploratory Data Analysis
│   ├── predict.py            # Predictions on new data
├── README.md                 # Project documentation
└── requirements.txt          # Python dependencies

🚀 Key Features

🔹 1. Exploratory Data Analysis (EDA)

Visualizations included:
- Distribution of promotions.
- Correlation matrix.
- Boxplots (e.g., age vs promotion).
- Histograms (e.g., training scores).
- Promotion breakdown by department.
Insights:
- avg_training_score, KPIs_met >80%, and previous_year_rating are highly correlated with promotions.

🔹 2. Data Preprocessing

Handled missing values using SimpleImputer.
Converted categorical variables to numerical using LabelEncoder.
Standardized features using StandardScaler.

🔹 3. Model Training

Used XGBoost for classification.
Addressed class imbalance with SMOTE.
Trained the model with the full feature set and improved its precision and recall by balancing classes using SMOTE:

🔹 4. Predictions

Deployed a script to predict promotions on new datasets.
Outputs predictions in predictions_hrdatatest.csv.

📊 Model Metrics

Model with Full Features

Accuracy: 97%.
Class 1 (Promoted):
- Recall: 94%.
- Precision: 99%.
- F1-Score: 96%.

⚙️ How to Run the Project

Set up the environment:

python -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\activate
pip install -r requirements.txt

Preprocess the data:
```
python scripts/data_preprocessing.py
```

Perform EDA:

python scripts/Exploratory_Data_Analysis.py

Train the model:
```
python models/model_py
```
Make predictions:
```
python scripts/predict.py
```

📦 Dependencies

Install all required libraries with:

pip install -r requirements.txt

📥 Outputs

Predictions are saved in the data/predictions_hrdatatest.csv file.
Trained models and scalers are saved in the models/ directory.

🤝 Contributing

Contributions are welcome! Feel free to submit a pull request or raise an issue. Let's make this project even better!

📧 Contact

For questions or feedback, please reach out to me at - LinkedIn Profile.

Name		Name	Last commit message	Last commit date
Latest commit History 114 Commits
App		App
Data		Data
models		models
scripts		scripts
visualizations		visualizations
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🎯 Employee Promotion Prediction System

📄 Project Overview

🔍 Objectives

🛠️ Technologies and Tools Used

📂 Project Structure

🚀 Key Features

🔹 1. Exploratory Data Analysis (EDA)

🔹 2. Data Preprocessing

🔹 3. Model Training

🔹 4. Predictions

📊 Model Metrics

Model with Full Features

⚙️ How to Run the Project

📦 Dependencies

📥 Outputs

🤝 Contributing

📧 Contact

✨ Thank you for exploring this project! 😊

About

Releases

Packages

Languages

Bautistao2/Employee-Promotion-Prediction-System.Data-Driven-Insights-for-HR-Decisions.

Folders and files

Latest commit

History

Repository files navigation

🎯 Employee Promotion Prediction System

📄 Project Overview

🔍 Objectives

🛠️ Technologies and Tools Used

📂 Project Structure

🚀 Key Features

🔹 1. Exploratory Data Analysis (EDA)

🔹 2. Data Preprocessing

🔹 3. Model Training

🔹 4. Predictions

📊 Model Metrics

Model with Full Features

⚙️ How to Run the Project

📦 Dependencies

📥 Outputs

🤝 Contributing

📧 Contact

✨ Thank you for exploring this project! 😊

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages