This case requires developing a model that predicts fraudulent transactions for a financial company and using insights from the model to build an actionable plan.

DG492003/Fraudulent_transactions_Detection

Fraudulent_transactions_Detection

This case requires developing a model that predicts fraudulent transactions for a financial company and using insights from the model to build an actionable plan.

1. Data cleaning, including missing values, outliers, and multicollinearity.

  • There are no null values in any column.
  • Most of the features are not correlated with each other.
  • The old balance of the person who initiated the transaction (oldbalanceOrg) is correlated with their new balance.
  • newbalanceDest and oldbalanceDest are also correlated.
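
The null check and the correlation check above can be illustrated with a minimal sketch. It uses plain Python and a handful of made-up rows, so the column names `old_balance` and `new_balance` and all values are assumptions for illustration, not data from this repository; on the real dataset a dataframe library would normally do the same work.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up rows for illustration only (hypothetical column names and values).
rows = [
    {"amount": 100.0, "old_balance": 1000.0, "new_balance": 900.0},
    {"amount": 2500.0, "old_balance": 5000.0, "new_balance": 2500.0},
    {"amount": 50.0, "old_balance": 200.0, "new_balance": 150.0},
    {"amount": 700.0, "old_balance": 9000.0, "new_balance": 8300.0},
]
columns = ["amount", "old_balance", "new_balance"]

# Count missing (None) values per column.
missing = {col: sum(row[col] is None for row in rows) for col in columns}

# Correlation between the balance before and after the transaction.
r_balances = pearson(
    [row["old_balance"] for row in rows],
    [row["new_balance"] for row in rows],
)

print("missing values per column:", missing)
print("old/new balance correlation:", round(r_balances, 3))
```

A coefficient close to 1 for two features is the kind of signal that suggests they carry overlapping information, which is what the multicollinearity check looks for.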

2. Describe your fraud detection model in detail.

The fraud detection model involves the following steps:

  • Load the dataset and explore it
  • Check the distribution of transaction types
  • Check whether data cleaning is needed
  • Perform statistical analysis of the data
  • Handle skewed data with Box-Cox and log transformations
  • Select the features used to train the model
  • Split the data into train and test sets with an 80-20 ratio
  • Use sampling to balance the dataset
  • Fit RandomForestClassifier and Logistic Regression models
  • Analyse the results
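
Three of the steps above (transforming a skewed feature, the 80-20 split, and sampling to balance the classes) can be sketched in plain Python. This is an illustration under stated assumptions, not the repository's code: the data is synthetic, the key names `amount` and `is_fraud` are hypothetical, the split is stratified by class, and random undersampling of the majority class stands in for whichever sampling method was actually used.

```python
import math
import random

def log_transform(values):
    """Compress a right-skewed, non-negative feature with log(1 + x)."""
    return [math.log1p(v) for v in values]

def group_by_label(rows, label_key):
    """Group rows into lists keyed by their class label."""
    groups = {}
    for row in rows:
        groups.setdefault(row[label_key], []).append(row)
    return groups

def stratified_split(rows, label_key, test_fraction=0.2, seed=0):
    """Split rows into train/test, keeping the class ratio in both parts."""
    rng = random.Random(seed)
    train, test = [], []
    for members in group_by_label(rows, label_key).values():
        members = members[:]          # copy so the caller's list is untouched
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

def undersample(rows, label_key, seed=0):
    """Randomly drop rows so every class is as small as the rarest one."""
    rng = random.Random(seed)
    groups = group_by_label(rows, label_key)
    smallest = min(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(rng.sample(members, smallest))
    rng.shuffle(balanced)
    return balanced

# Synthetic data: 990 legitimate rows and 10 fraudulent ones (hypothetical).
rows = [{"amount": float(i + 1), "is_fraud": 0} for i in range(990)]
rows += [{"amount": 1e6 + i, "is_fraud": 1} for i in range(10)]

# Add a log-transformed copy of the skewed feature.
for row, value in zip(rows, log_transform([r["amount"] for r in rows])):
    row["log_amount"] = value

train, test = stratified_split(rows, "is_fraud")

# Balance only the training set; the test set keeps the real class ratio.
balanced_train = undersample(train, "is_fraud")

print(len(train), len(test), len(balanced_train))
```

Balancing only the training portion is a deliberate choice in this sketch: the held-out 20% then still reflects the imbalance the model would face in practice.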

3. What are the key factors that predict a fraudulent customer?

'isFlaggedFraud', 'oldbalanceOrg', 'type_CASH_OUT', and 'type_TRANSFER' are the features that best predict fraud.
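
Feature names such as 'type_CASH_OUT' and 'type_TRANSFER' look like indicator columns produced by one-hot encoding a categorical `type` column. That is an assumption about how they were created; the sketch below shows the idea in plain Python with made-up rows (the 'PAYMENT' category and all values are hypothetical).

```python
def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded

# Made-up transactions for illustration only.
transactions = [
    {"type": "CASH_OUT", "amount": 500.0},
    {"type": "TRANSFER", "amount": 1200.0},
    {"type": "PAYMENT", "amount": 30.0},
]

encoded = one_hot(transactions, "type")
for row in encoded:
    print(row)
```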

4. Do these factors make sense? If yes, how? If not, why not?

Yes, these factors make sense.

  • 'isFlaggedFraud' - the column that marks whether a transaction was flagged as fraud.

  • 'oldbalanceOrg' - indicates that a customer with a higher account balance is more prone to fraudulent transactions.

  • 'Cash-Out' - refers to converting a non-cash asset into cash. Online selling of high-value goods and similar activity is therefore more prone to fraud.

  • 'Transfer' - uninformed transfers have been found to be a channel through which fraud takes place.

5. What kind of prevention should be adopted while the company updates its infrastructure?

  • Use multiple accounts to distribute a large balance.
  • Keep a separate account for online transactions.
  • Provide a special mode for cash-out transactions.
  • Require special registration of account users for transfers.

6. Demonstrate the performance of the model by using the best set of tools

  • Accuracy is used to measure the overall share of correct predictions.

  • Accuracy alone cannot be relied on, because in a fraud detection problem recall and precision matter more than other metrics.

  • Highest precision: Logistic Regression

  • Highest accuracy: Random Forest Classifier

  • Highest recall: Random Forest Classifier

  • Highest F1 score: Random Forest Classifier

From this I conclude that Random Forest Classifier is the best model for this fraudulent transaction detection task.
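
To illustrate why accuracy alone can mislead on imbalanced fraud data, here is a minimal sketch that computes accuracy, precision, recall, and F1 from scratch. The labels are synthetic (990 legitimate, 10 fraudulent) and the two "models" are hypothetical prediction lists, not results from this repository.

```python
def scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = fraud)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Synthetic ground truth: 990 legitimate transactions, then 10 frauds.
y_true = [0] * 990 + [1] * 10

# Hypothetical model A: predicts "legitimate" for everything.
always_legit = [0] * 1000

# Hypothetical model B: 4 false alarms, catches 8 of the 10 frauds.
detector = [1] * 4 + [0] * 986 + [1] * 8 + [0] * 2

baseline_scores = scores(y_true, always_legit)
detector_scores = scores(y_true, detector)

print("always legit:", baseline_scores)
print("detector:    ", detector_scores)
```

The always-legitimate baseline reaches 99% accuracy while catching no fraud at all, which is why recall, precision, and F1 are the more informative metrics to compare.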
