Welcome to my CodSoft Data Science Projects repository. It collects the Jupyter notebooks for the data science projects I completed during my internship at CodSoft. A brief overview of each project follows.
This project involves predicting the survival of passengers on the Titanic. We use machine learning algorithms to analyze the Titanic dataset and predict whether a passenger survived or not based on features such as age, sex, class, and more.
- Dataset: Titanic - Machine Learning from Disaster
- Techniques: Data cleaning, Exploratory Data Analysis (EDA), Feature Engineering, Logistic Regression, Decision Trees, Random Forest
- Libraries: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
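As a rough illustration of how the techniques above fit together, here is a minimal sketch of a preprocessing-plus-Logistic-Regression pipeline in scikit-learn. The tiny DataFrame is made up for the example and only borrows the column names of the Kaggle Titanic dataset; the choice of features, imputation strategy, and model settings are assumptions and may differ from what the notebook does.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up rows in the shape of the Titanic columns (not real passenger data).
df = pd.DataFrame({
    "Pclass": [3, 1, 3, 1, 3, 2, 3, 1],
    "Sex": ["male", "female", "female", "female", "male", "male", "male", "female"],
    "Age": [22.0, 38.0, 26.0, 35.0, None, 54.0, 2.0, 27.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 51.86, 21.07, 11.13],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 1],
})

numeric_features = ["Age", "Fare"]
categorical_features = ["Pclass", "Sex"]

# Data cleaning and feature engineering: fill missing ages with the median,
# scale numeric columns, and one-hot encode the categorical ones.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_features),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
])

# Chain preprocessing and the classifier so both are fitted together.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

X = df.drop(columns="Survived")
y = df["Survived"]
model.fit(X, y)
predictions = model.predict(X)
```

Swapping `LogisticRegression` for `DecisionTreeClassifier` or `RandomForestClassifier` keeps the rest of the pipeline unchanged. On the real dataset, evaluate on a held-out split rather than on the training rows as this toy example does.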
In this project, we aim to detect fraudulent credit card transactions. The dataset contains transactions made by European cardholders in September 2013, and we train machine learning models on it to identify fraud.
- Dataset: Credit Card Fraud Detection dataset
- Techniques: Data preprocessing, Resampling (SMOTE), Feature Scaling, Classification Algorithms (Logistic Regression, Random Forest, Gradient Boosting)
- Libraries: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, Imbalanced-learn
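To show what SMOTE resampling does, the sketch below implements the core idea by hand with NumPy: each synthetic fraud sample is placed at a random point on the line between a real minority sample and one of its nearest minority neighbours. This is a simplified illustration on made-up 2-D data, not the Imbalanced-learn implementation; the function name `smote_sketch` and its parameters are hypothetical. In practice the `SMOTE` class from `imblearn.over_sampling` would be used instead.

```python
import numpy as np


def smote_sketch(X_minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between minority neighbours."""
    rng = np.random.default_rng(seed)
    X_minority = np.asarray(X_minority, dtype=float)
    n = len(X_minority)
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances between minority samples.
    diffs = X_minority[:, None, :] - X_minority[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest minority neighbours of every sample.
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a base sample and one of its neighbours for each synthetic point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position between base and neighbour.
    gap = rng.random((n_new, 1))
    return X_minority[base] + gap * (X_minority[picked] - X_minority[base])


# Made-up imbalanced data: 200 legitimate vs 10 fraudulent transactions.
data_rng = np.random.default_rng(42)
X_legit = data_rng.normal(0.0, 1.0, size=(200, 2))
X_fraud = data_rng.normal(3.0, 0.5, size=(10, 2))

X_new = smote_sketch(X_fraud, n_new=len(X_legit) - len(X_fraud))
X_balanced = np.vstack([X_legit, X_fraud, X_new])
y_balanced = np.concatenate([
    np.zeros(len(X_legit), dtype=int),
    np.ones(len(X_fraud) + len(X_new), dtype=int),
])
```

Resampling should be applied to the training split only, so that the test set keeps the original class balance and the reported metrics are not inflated by synthetic samples.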
This project focuses on building a recommendation system for movies. Using collaborative filtering and content-based filtering techniques, we develop a system that recommends movies to users based on their past preferences and similarities between movies.
- Dataset: MovieLens dataset
- Techniques: Collaborative Filtering, Content-Based Filtering, Singular Value Decomposition (SVD)
- Libraries: Pandas, NumPy, Scikit-learn, Surprise
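The collaborative filtering idea can be sketched with a plain truncated SVD in NumPy. The rating matrix below is made up, the helper `recommend` is hypothetical, and this is not the `SVD` algorithm from Surprise (which learns its factors by gradient descent on the observed ratings only); it is just a small illustration of predicting missing ratings from a low-rank approximation.

```python
import numpy as np

# Made-up user x movie rating matrix; 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)
rated = ratings > 0

# Centre each user's observed ratings on that user's mean;
# unrated entries become 0, i.e. "average for this user".
user_means = ratings.sum(axis=1) / rated.sum(axis=1)
centered = np.where(rated, ratings - user_means[:, None], 0.0)

# Keep only the k strongest latent factors.
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
k = 2
approx = (U[:, :k] * s[:k]) @ Vt[:k, :] + user_means[:, None]


def recommend(user, n=1):
    """Return indices of the n unrated movies with the highest predicted rating."""
    scores = np.where(rated[user], -np.inf, approx[user])
    return np.argsort(scores)[::-1][:n]


top_for_first_user = recommend(0)
```

Content-based filtering would instead compare movies by their own attributes (for example genres), which needs no ratings from other users.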
To run these notebooks, you need the following dependencies installed:
- Python 3.x
- Jupyter Notebook
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn
- Imbalanced-learn (for the Credit Card Fraud Detection project)
- Surprise, published on PyPI as `scikit-surprise` (for the Movie Recommendation System project)
You can install the necessary libraries using pip:
pip install pandas numpy matplotlib seaborn scikit-learn imbalanced-learn scikit-surprise
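After installing, a quick way to confirm that everything is importable is a small standard-library check like the one below. It is only a convenience sketch: the function name `check_dependencies` is hypothetical, and the mapping lists the import names these libraries are known to use (for example `sklearn` for Scikit-learn and `imblearn` for Imbalanced-learn).

```python
from importlib.util import find_spec

# Library name -> module name used in `import` statements.
REQUIRED = {
    "Pandas": "pandas",
    "NumPy": "numpy",
    "Matplotlib": "matplotlib",
    "Seaborn": "seaborn",
    "Scikit-learn": "sklearn",
    "Imbalanced-learn": "imblearn",
    "Surprise": "surprise",
}


def check_dependencies(required=REQUIRED):
    """Return {library name: True if its module can be found, else False}."""
    return {name: find_spec(module) is not None for name, module in required.items()}


if __name__ == "__main__":
    for name, installed in check_dependencies().items():
        print(f"{name}: {'ok' if installed else 'MISSING'}")
```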
Clone this repository to your local machine:
git clone https://github.com/ViHubcode/Codsoft.git
Navigate to the project directory and start the Jupyter Notebook server:
cd Codsoft
jupyter notebook
Open the desired notebook file to explore the project.
If you have suggestions for improvements or find any issues, please feel free to submit a pull request or open an issue.
This project is licensed under the MIT License. See the LICENSE file for details.