This case requires developing a model to predict fraudulent transactions for a financial company and using insights from the model to develop an actionable plan.
- There are no null values in any column.
- Most of the features are not correlated with each other.
- The old balance (oldbalanceOrg) is correlated with the new balance of the person who initiated the transaction.
- newbalanceDest and oldbalanceDest are also correlated.
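The checks behind these observations can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual notebook: the column name `newbalanceOrig`, the generated values, and the sample size are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the transaction table (values are made up;
# "newbalanceOrig" is an assumed column name).
rng = np.random.default_rng(0)
n = 1_000
old_orig = rng.exponential(50_000, n)
amount = rng.exponential(5_000, n)
old_dest = rng.exponential(80_000, n)
df = pd.DataFrame({
    "amount": amount,
    "oldbalanceOrg": old_orig,
    "newbalanceOrig": np.clip(old_orig - amount, 0, None),
    "oldbalanceDest": old_dest,
    "newbalanceDest": old_dest + amount,
})

# Count missing values per column; all zeros means there are no nulls.
null_counts = df.isnull().sum()
print(null_counts)

# Pearson correlation matrix between the numeric columns.
corr = df.corr()
print(corr.round(2))
```

In a matrix like this, the balance pairs stand out with coefficients close to 1, while most other pairs stay near 0.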
The fraud detection model involves the following steps:
- Load the dataset and explore it
- Check the distribution of transaction types
- Check whether data cleaning is needed
- Statistical analysis of the data
- Deal with skewed data using Box-Cox and log transformations
- Select the features used to train the model
- Train-test split with the 80-20 rule
- Use sampling to balance the dataset
- Fit RandomForestClassifier and Logistic Regression models
- Analyse the results
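The core of these steps could look roughly like the sketch below. It runs on synthetic data and makes several assumptions that the text does not confirm: only the log transformation is shown (Box-Cox is left out), random undersampling stands in for the unspecified sampling method, the target column is called `isFraud`, and all hyperparameters are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 5_000
# Synthetic, heavily imbalanced data (about 2% fraud); not the real dataset.
is_fraud = (rng.random(n) < 0.02).astype(int)
amount = rng.lognormal(mean=8, sigma=1.5, size=n) * (1 + 4 * is_fraud)
old_balance = rng.lognormal(mean=10, sigma=1.2, size=n)
X = pd.DataFrame({"amount": amount, "oldbalanceOrg": old_balance})
y = pd.Series(is_fraud, name="isFraud")  # assumed target name

# Log transformation to reduce right skew (log1p also handles zeros).
X_log = np.log1p(X)

# 80-20 split, stratified so both parts keep the same fraud rate.
X_train, X_test, y_train, y_test = train_test_split(
    X_log, y, test_size=0.2, stratify=y, random_state=42
)

# Random undersampling of the majority class, on the training set only
# (an assumed choice of sampling method).
fraud_idx = y_train[y_train == 1].index
legit_idx = y_train[y_train == 0].sample(n=len(fraud_idx), random_state=42).index
balanced_idx = fraud_idx.union(legit_idx)
X_bal, y_bal = X_train.loc[balanced_idx], y_train.loc[balanced_idx]

models = {
    "RandomForestClassifier": RandomForestClassifier(n_estimators=100, random_state=42),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_bal, y_bal)
    print(name, "test accuracy:", round(model.score(X_test, y_test), 3))
```

Balancing only the training split keeps the test set at its natural fraud rate, so the evaluation reflects the imbalance the model would face in practice.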
'isFlaggedFraud', 'oldbalanceOrg', 'type_CASH_OUT' and 'type_TRANSFER' are the features that predict fraud.
Yes, these factors absolutely make sense.
'isFlaggedFraud' - the column that marks a transaction as fraudulent or not.
'oldbalanceOrg' - indicates that a customer with a higher account balance is more prone to fraudulent transactions.
'Cash-Out' - refers to converting a non-cash asset into cash. Thus online selling of high-value goods, etc., is more prone to fraud.
'Transfer' - uninformed transfers have been found to be a channel through which fraud takes place.
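One way such a feature ranking could be obtained is from the impurity-based importances of a fitted random forest. The text does not say this is how the features were selected, so this is only a sketch. It assumes that the `type_*` columns come from one-hot encoding a `type` column, and it uses synthetic data with a made-up fraud rule and a hypothetical `noise` column.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 4_000
# Synthetic data; the category list and the "noise" column are illustrative.
raw = pd.DataFrame({
    "type": rng.choice(["CASH_OUT", "TRANSFER", "PAYMENT"], size=n),
    "oldbalanceOrg": rng.exponential(50_000, n),
    "noise": rng.normal(size=n),
})
# Toy rule: fraud happens on CASH_OUT/TRANSFER from high-balance accounts.
risky = raw["type"].isin(["CASH_OUT", "TRANSFER"]) & (raw["oldbalanceOrg"] > 100_000)
y = risky.astype(int)

# One-hot encode the transaction type -> type_CASH_OUT, type_TRANSFER, ...
X = pd.get_dummies(raw, columns=["type"], dtype=int)

forest = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)

# Impurity-based importances sum to 1; higher means more useful for splitting.
importances = pd.Series(forest.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```

Impurity-based importances can favour continuous columns over binary ones, so a ranking like this is best read as a rough guide rather than proof that a feature causes fraud.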
- Customers can spread a large balance across multiple accounts.
- Separate account for online transactions.
- Special mode for cash-out transactions.
- Special registration of account users for transfer modes.
Accuracy is used to measure model performance.
Accuracy alone cannot be relied on, because in a fraud detection problem recall and precision are more important than other metrics.
We get:
- the highest precision with Logistic Regression
- the highest accuracy with the Random Forest Classifier
- the highest recall with the Random Forest Classifier
- the highest F1 score with the Random Forest Classifier
From this I conclude that the Random Forest Classifier is the best model for detecting fraudulent transactions.
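To illustrate why accuracy alone can mislead on imbalanced fraud data, the sketch below computes the four metrics from confusion-matrix counts. The helper `classification_metrics` and all counts are hypothetical and are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a model predicts no positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 10,000 transactions, 10 of them fraudulent.
# A model that never predicts fraud still looks excellent on accuracy.
always_legit = classification_metrics(tp=0, fp=0, fn=10, tn=9_990)
print(always_legit)  # accuracy 0.999, but recall 0.0

# A model that catches 8 of the 10 frauds at the cost of 20 false alarms.
useful = classification_metrics(tp=8, fp=20, fn=2, tn=9_970)
print(useful)
```

The second model has slightly lower accuracy than the first, yet it is the only one of the two that detects any fraud, which is why recall, precision and F1 carry more weight in the comparison.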