This repository contains Python code for visualizing the Bank Marketing dataset. The dataset is loaded from a CSV file, and its numerical and categorical features are explored with Pandas, Matplotlib, Seaborn, and Plotly.
The analysis covers the distribution of categorical and numerical features, the relationships between features and the label, and multivariate analysis.
- Clone the repository:
  git clone https://github.com/ishika-mi/DataVisualization-Banking.git
  cd DataVisualization-Banking
- Install the required libraries by running the following command:
  pip install pandas numpy matplotlib seaborn plotly
- Run the Jupyter notebook or Python script to execute the visualizations.
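
After installing, a quick check can confirm that the libraries are importable before opening the notebook. The snippet below is only a sketch: the helper name `missing_packages` is hypothetical and not part of this repository, and it only checks that each package can be found, not which version is installed.

```python
from importlib.util import find_spec

# Import names of the libraries listed in the pip install command.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "plotly"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing: " + ", ".join(missing))
else:
    print("All required libraries found.")
```

If anything is reported as missing, rerunning the `pip install` command in the same environment that runs the notebook usually resolves it.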
- age: Represents the age of the individual.
- job: Describes the occupation or job of the person.
- marital: Indicates the marital status of the person (e.g., married, single, divorced).
- education: Represents the educational level of the person (e.g., primary, secondary, tertiary).
- default: Indicates whether the person has credit in default ('yes', 'no', or 'unknown').
- housing: Shows whether the person has a housing loan ('yes', 'no', or 'unknown').
- loan: Indicates whether the person has a personal loan ('yes', 'no', or 'unknown').
- contact: Describes the method of communication used to contact the person (e.g., 'cellular', 'telephone').
- day: Indicates the day of the month of the last contact.
- month: Represents the month of the last contact.
- duration: Represents the duration of the last contact in seconds.
- campaign: Indicates the number of contacts made during this campaign.
- pdays: Describes the number of days since the person was last contacted or -1 if they were not previously contacted.
- previous: Represents the number of contacts made before this campaign.
- poutcome: Indicates the outcome of the previous marketing campaign.
- deposit: The target variable, indicating whether the person subscribed to a term deposit ('yes' or 'no').
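
The features above can be loaded with Pandas and split into numerical and categorical groups before plotting. The following is a minimal sketch, not the repository's actual code: the function names are hypothetical, the sample rows are made up for illustration and cover only a subset of the columns, and the real CSV path would replace the in-memory sample.

```python
import io

import pandas as pd

# Made-up rows covering a subset of the columns (illustration only, not real data).
SAMPLE_CSV = """age,job,marital,education,housing,duration,campaign,pdays,previous,poutcome,deposit
59,admin.,married,secondary,yes,1042,1,-1,0,unknown,yes
41,technician,single,tertiary,no,389,2,-1,0,unknown,no
35,services,divorced,primary,yes,215,3,180,2,failure,no
28,management,single,tertiary,no,640,1,92,1,success,yes
"""


def load_bank_data(source):
    """Read the dataset from a file path or file-like object."""
    return pd.read_csv(source)


def split_feature_types(df, target="deposit"):
    """Return (numerical, categorical) column names, excluding the target."""
    features = df.drop(columns=[target])
    numerical = features.select_dtypes(include="number").columns.tolist()
    categorical = features.select_dtypes(exclude="number").columns.tolist()
    return numerical, categorical


# Assumption: swap io.StringIO(SAMPLE_CSV) for the path to the real CSV file.
df = load_bank_data(io.StringIO(SAMPLE_CSV))
numerical_cols, categorical_cols = split_feature_types(df)

# pdays uses -1 as a sentinel for "not previously contacted", so a boolean
# flag is often easier to plot than the raw value.
df["previously_contacted"] = df["pdays"] != -1

print("Numerical:", numerical_cols)
print("Categorical:", categorical_cols)
```

Splitting by dtype keeps the plotting code generic: distribution plots can loop over the numerical list and count plots over the categorical list.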
- Categorical Feature Distributions: Visualizes the distribution of categorical features using count plots and frequency polygons.
- Frequency Polygon: Plots frequency polygons for each categorical feature to show the distribution of values.
- Relationship between Categorical Features and Label: Uses catplots to explore the relationship between categorical features and the label ('deposit').
- Numerical Feature Distributions: Visualizes the distribution of numerical features using histograms, density plots, violin plots, bar graphs, box plots, and scatter plots.
- Heatmap: Displays the correlation heatmap for numerical features.
- Box Plot: Plots box and whisker plots for numerical features.
- Multivariate Analysis: Includes scatter plots, parallel coordinate plots, and contour plots for multivariate analysis.
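
Three of the plot types listed above can be sketched with Seaborn as follows. This is an illustrative example under stated assumptions rather than the notebook's own code: the data is randomly generated, the function name `plot_overview` is hypothetical, and `sns.countplot` with `hue` stands in for the figure-level `sns.catplot(kind="count")` so that all panels fit in one figure.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; not needed in a notebook

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Randomly generated stand-in data (assumption: not the real dataset).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame(
    {
        "age": rng.integers(18, 90, n),
        "duration": rng.integers(5, 2000, n),
        "campaign": rng.integers(1, 10, n),
        "marital": rng.choice(["married", "single", "divorced"], n),
        "deposit": rng.choice(["yes", "no"], n),
    }
)


def plot_overview(df, categorical, numerical, target="deposit"):
    """Draw a count plot, a box plot, and a correlation heatmap side by side."""
    fig, axes = plt.subplots(1, 3, figsize=(18, 5))

    # Categorical feature vs. label: counts per category, split by target.
    sns.countplot(data=df, x=categorical, hue=target, ax=axes[0])
    axes[0].set_title(f"{categorical} by {target}")

    # Numerical feature distribution per target class.
    sns.boxplot(data=df, x=target, y=numerical[0], ax=axes[1])
    axes[1].set_title(f"{numerical[0]} by {target}")

    # Pairwise Pearson correlation between the numerical features.
    corr = df[numerical].corr()
    sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=axes[2])
    axes[2].set_title("Correlation heatmap")

    fig.tight_layout()
    return fig, corr


fig, corr = plot_overview(df, "marital", ["age", "duration", "campaign"])
fig.savefig("overview.png")  # hypothetical output file name
```

In a notebook, the `matplotlib.use("Agg")` line and the `savefig` call can be dropped, since the figure renders inline.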