Feature Engineering-Based Credit Card Fraud Detection for Risk Minimization in E-Commerce
- Data Mining
- Sequential Feature Selection
- Comparative Analysis
Presented at the 4th International Conference on Intelligent Computing & Optimization 2021. Available online in Lecture Notes in Networks and Systems, Springer.
In today’s financial business, financial fraud is a rising concern with far-reaching repercussions, and data mining plays a crucial role in identifying fraudulent transactions. However, credit card fraud detection is challenging for several reasons: the behaviour of both normal and fraudulent profiles changes frequently, fraudulent data are scarce, and the datasets are highly imbalanced, among others. Moreover, the efficiency of fraud identification in online transactions is strongly affected by the dataset sampling method and by feature selection. Our study investigates the performance of five popular machine learning approaches, namely Logistic Regression (LR), Random Forest (RF), Support Vector Classifier (SVC), Gradient Boosting (GBC), and K-Nearest Neighbors (KNN), in terms of feature selection. Features are selected by Sequential Forward Selection; in addition, the models’ performance is improved by handling imbalanced data with Random Undersampling and by feature scaling with PCA transformation and RobustScaler for both numerical and categorical data. Finally, the performance of the different machine learning techniques is assessed in terms of accuracy, precision, recall, and F1-measure on a benchmark credit card dataset.
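To make the pipeline named in the abstract more concrete, the following is a minimal sketch of random undersampling followed by greedy sequential forward selection, scored by F1. It is not the authors' implementation: it uses only the Python standard library, the function names (`random_undersample`, `sequential_forward_selection`, `centroid_f1`) are hypothetical, the data are a tiny synthetic toy set rather than the benchmark dataset, and a nearest-centroid classifier stands in for the five models studied. PCA and robust scaling are omitted for brevity.

```python
import random

def random_undersample(X, y, seed=0):
    """Keep every minority-class row plus an equal-sized random sample of the majority class."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = minority + rng.sample(majority, len(minority))
    rng.shuffle(keep)
    return [X[i] for i in keep], [y[i] for i in keep]

def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive (fraud) class, label 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def centroid_f1(X, y, features):
    """F1 of a nearest-centroid classifier restricted to `features`.

    Assumption: a stand-in model, scored on the same rows it was fit on
    to keep the sketch short; a real study would use held-out data.
    """
    def centroid(label):
        rows = [[x[f] for f in features] for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]
    c0, c1 = centroid(0), centroid(1)
    def dist(row, c):
        return sum((row[f] - m) ** 2 for f, m in zip(features, c))
    y_pred = [1 if dist(x, c1) < dist(x, c0) else 0 for x in X]
    return precision_recall_f1(y, y_pred)[2]

def sequential_forward_selection(n_features, score, k):
    """Greedily add the feature that most improves `score` until k features are chosen."""
    selected, remaining = [], list(range(n_features))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic toy data (assumption): feature 0 separates the classes, feature 1 does not.
X = [[0.1 * (i % 5), float(i % 7)] for i in range(200)]       # 200 legitimate rows
X += [[5.0 + 0.1 * (i % 3), float(i % 7)] for i in range(10)]  # 10 fraudulent rows
y = [0] * 200 + [1] * 10

Xb, yb = random_undersample(X, y, seed=0)
selected = sequential_forward_selection(2, lambda fs: centroid_f1(Xb, yb, fs), k=1)
print(len(yb), sum(yb), selected)
```

Undersampling is applied before selection here so that the F1 score driving the greedy search is not dominated by the majority class; whether the study applies the steps in this order is not stated in the abstract.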