
Research On Credit Card Fraud Detection Based On XGBoost Algorithm

Posted on: 2021-07-16  Degree: Master  Type: Thesis
Country: China  Candidate: N N Qian  Full Text: PDF
GTID: 2558306917481784  Subject: Applied Statistics
Abstract/Summary:
In recent years, with the continuous development of machine learning technology, the XGBoost algorithm has been increasingly used in credit card fraud detection and has achieved good results. Because credit card data is imbalanced, fraud detection systems still need to be improved to reduce the losses of credit card users and financial institutions. Credit card fraud detection still faces two major problems: one is data imbalance, and the other is concept drift in the data.

To address the imbalanced class distribution in credit card fraud transactions, this paper improves on the simple combination of the XGBoost algorithm and the Borderline SMOTE method. The main idea is to combine the strong binary classification ability of XGBoost with the strong generalization ability of Borderline SMOTE: using AUC as the main evaluation index, the data is resampled over multiple rounds to train a classifier, and a classification prediction model is built on that classifier. This is expected to improve the prediction performance of the model. In addition, this paper improves the algorithm itself. Since the costs of different misjudgments differ in credit card fraud detection, this paper trains with a cost-sensitive loss function in place of the original loss function.

The research is based on a dataset of 284,807 transaction records, which includes the time elapsed from the first transaction to each transaction, the amount of each transaction, and 28 features obtained by PCA. This paper preprocesses and cleans the data, and then uses the random forest, AdaBoost, and XGBoost algorithms to select features. Because the dataset is imbalanced, and class imbalance makes prediction results inaccurate, this paper uses the AdaBoost, XGBoost, and SVM algorithms to detect fraudulent transactions. The data is processed with three SMOTE methods, and the results are compared with the detection performance on the unbalanced data to find the detection model best suited to the data in this paper. Then, starting from the simple combination of Borderline SMOTE and the XGBoost algorithm, both the data processing method and the loss function of the algorithm are improved.
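
The abstract does not give the form of the cost-sensitive loss, so the following is only a minimal sketch of one common choice: a class-weighted logistic loss in which a missed fraud (false negative) is penalized more heavily than a false alarm. The function names and the cost values `fn_cost=5.0` and `fp_cost=1.0` are hypothetical, chosen for illustration, and are not taken from the thesis. The sketch returns the per-sample gradient and Hessian with respect to the raw score, which is the pair of quantities a gradient-boosting custom objective has to supply.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_logloss_grad_hess(margin, label, fn_cost=5.0, fp_cost=1.0):
    """Gradient and Hessian of a class-weighted logistic loss.

    Assumed loss (illustrative, not the thesis's exact formula):
        L = -[fn_cost * y * log(p) + fp_cost * (1 - y) * log(1 - p)]
    with p = sigmoid(margin) and y in {0, 1} (1 = fraud).
    """
    p = sigmoid(margin)
    # Fraud samples carry the false-negative cost, normal samples the false-positive cost.
    weight = fn_cost if label == 1 else fp_cost
    grad = weight * (p - label)      # dL/dmargin
    hess = weight * p * (1.0 - p)    # d^2L/dmargin^2
    return grad, hess


def cost_sensitive_objective(margins, labels, fn_cost=5.0, fp_cost=1.0):
    """Apply the weighted loss to a batch of raw scores and labels."""
    grads, hessians = [], []
    for margin, label in zip(margins, labels):
        g, h = weighted_logloss_grad_hess(margin, label, fn_cost, fp_cost)
        grads.append(g)
        hessians.append(h)
    return grads, hessians


if __name__ == "__main__":
    # One fraud and one normal transaction, both scored at margin 0 (p = 0.5).
    print(cost_sensitive_objective([0.0, 0.0], [1, 0]))
```

With these assumed costs, a fraud sample scored at the same probability as a normal sample produces a gradient five times larger in magnitude, so each boosting round puts more effort into correcting missed frauds. In practice the cost ratio would have to be tuned against the chosen evaluation index (AUC in this paper).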
Keywords/Search Tags: XGBoost, credit card fraud, imbalanced data, cost-sensitive learning