
Shreyjain203/Customer-churn-analysis

 
 


Telecom Churn Prediction with Logistic Regression

This repository analyzes telecommunication customer churn using a Logistic Regression model built on data from Kaggle. The project focuses on:

  • Data Exploration and Cleaning: Cleaning, visualizing, and understanding the churn dataset.
  • Feature Engineering: Creating and transforming features for better model performance.
  • Multicollinearity Analysis: Identifying and addressing highly correlated features to improve model robustness.
  • Influential Points Detection: Identifying data points that may disproportionately affect the fitted model and its predictions.
  • Logistic Regression Model Development: Training and evaluating a Logistic Regression model for predicting customer churn.
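As a rough illustration of the last step, here is a minimal sketch of training and evaluating a logistic regression classifier. It is not the notebook's code: it uses plain NumPy with batch gradient descent, and the data is synthetic. The feature names (`tenure`, `monthly`), the coefficients used to simulate churn, and the 80/20 split are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit logistic regression by batch gradient descent.

    Returns the weight vector with the intercept as its first entry.
    """
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)                     # predicted churn probabilities
        w -= lr * Xb.T @ (p - y) / len(y)       # gradient of the mean log-loss
    return w

def predict(X, w, threshold=0.5):
    """Classify as churn (1) when the predicted probability reaches the threshold."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return (sigmoid(Xb @ w) >= threshold).astype(int)

def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return float(np.mean(y_true == y_pred))

# Synthetic stand-in for the churn data (hypothetical features and effect sizes).
rng = np.random.default_rng(0)
n = 1000
tenure = rng.uniform(0, 72, n)       # months with the provider
monthly = rng.uniform(20, 120, n)    # monthly charges
true_logit = -0.05 * (tenure - 36) + 0.03 * (monthly - 70)
churn = (rng.uniform(size=n) < sigmoid(true_logit)).astype(int)
X = np.column_stack([tenure, monthly])

# 80/20 train/test split, standardizing with training-set statistics only.
idx = rng.permutation(n)
train, test = idx[:800], idx[800:]
mu, sd = X[train].mean(axis=0), X[train].std(axis=0)
X_train, X_test = (X[train] - mu) / sd, (X[test] - mu) / sd

w = fit_logistic(X_train, churn[train])
test_acc = accuracy(churn[test], predict(X_test, w))
print("weights (intercept, tenure, monthly):", np.round(w, 3))
print("test accuracy:", round(test_acc, 3))
```

Standardizing with statistics from the training rows only keeps information from the test set out of the fitted model, so the reported accuracy stays an honest held-out estimate.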
Key Points:

  • Data Source: Kaggle's telecommunication churn dataset.
  • Model Accuracy: 81% on the test set.
  • Methodology:
    • Exploratory Data Analysis (EDA)
    • Feature Engineering
    • Multicollinearity Analysis (e.g., GVIF, the generalized variance inflation factor)
    • Influential Points Detection
    • Logistic Regression Model Training and Evaluation
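The multicollinearity step in the list above can be sketched as follows. This is a minimal NumPy illustration of the generalized variance inflation factor in the Fox and Monette formulation, GVIF = det(R11) · det(R22) / det(R), where R is the correlation matrix of all predictors, R11 the block for the term being checked, and R22 the block for the remaining predictors. The function name `gvif`, the synthetic columns, and the three-level `contract` variable are assumptions for the example, not taken from the notebook.

```python
import numpy as np

def gvif(X, term_cols):
    """Generalized VIF for the term made up of the columns in term_cols.

    For a single-column term this reduces to the ordinary VIF, 1 / (1 - R^2).
    Assumes at least one predictor column lies outside the term.
    """
    R = np.corrcoef(X, rowvar=False)
    term = list(term_cols)
    rest = [j for j in range(X.shape[1]) if j not in term]
    R11 = R[np.ix_(term, term)]   # correlations within the term
    R22 = R[np.ix_(rest, rest)]   # correlations among the other predictors
    return float(np.linalg.det(R11) * np.linalg.det(R22) / np.linalg.det(R))

# Synthetic predictors (hypothetical): total charges is nearly tenure * monthly,
# so it should be flagged as collinear with the other two numeric columns.
rng = np.random.default_rng(1)
n = 2000
tenure = rng.uniform(0, 72, n)
monthly = rng.uniform(20, 120, n)
total = tenure * monthly + rng.normal(0, 50, n)

# A three-level categorical (contract type) encoded as two dummy columns,
# generated independently of the numeric predictors.
contract = rng.integers(0, 3, n)
dummies = np.column_stack([contract == 1, contract == 2]).astype(float)

X = np.column_stack([tenure, monthly, total, dummies])

gvif_total = gvif(X, [2])          # one-column term: ordinary VIF
gvif_contract = gvif(X, [3, 4])    # two-column term: both dummies together

# Fox and Monette suggest comparing GVIF ** (1 / (2 * df)) across terms,
# where df is the number of columns in the term.
print("GVIF total charges:", round(gvif_total, 2))
print("GVIF contract:", round(gvif_contract, 3),
      "| scaled:", round(gvif_contract ** (1 / (2 * 2)), 3))
```

Treating the dummy columns of one categorical variable as a single term is the point of GVIF: the dummies are correlated with each other by construction, and a per-column VIF would report that as collinearity even when the variable is unrelated to everything else.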
Contribution:

Feel free to fork and contribute to this project. Any insights or improvements are welcome!

Disclaimer: This is a basic implementation for educational purposes only. The code and approach may require further refinement for real-world deployment.

