
Prediction Of Shoppers' Purchasing Intention Based On Stacking Ensemble Classification

Posted on: 2022-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: Q L Xing
Full Text: PDF
GTID: 2517306509489104
Subject: Applied Statistics
Abstract/Summary:
In recent years, network technology has improved greatly and online shopping has grown rapidly. Online shopping makes buying goods far more convenient, and because online malls carry many more product categories than physical malls, shoppers have more choice. A customer's purchase intention is a major factor in whether they eventually buy a product, so understanding that intention is one of the main strategies merchants use to increase sales. With a deeper understanding of their customers, merchants can recommend more suitable products and increase revenue, while customers receive more personalized service and a better shopping experience.

The dataset used in this thesis comes from the UCI repository and contains session information from customers browsing shopping web pages. Several classification algorithms are used to predict whether a customer intends to buy: support vector machines, decision trees, random forests, XGBoost, LightGBM, CatBoost, and the stacking ensemble method. To make the experimental results reliable, ten experiments are conducted, each with a new split of the dataset into a training set and a test set at a ratio of 7:3. On the training set, each model's hyperparameters are tuned step by step based on four cross-validated evaluation metrics: F1, G-mean, accuracy, and AUC. Three models, XGBoost, LightGBM, and CatBoost, are then chosen to build the stacking model, and the evaluation metrics of every model are calculated on the test set of the current split. Because the dataset is imbalanced and contains both continuous and discrete categorical variables, the SMOTENC method is applied during model training and parameter tuning. Finally, the results of the ten experiments are summarized, their averages are calculated, and the results are analyzed.

The experimental results show that CatBoost has the highest F1 and TNR, indicating that it recognizes both types of customers relatively well and identifies more customers with no purchase intention. Integrating multiple strong models further improves the classification results: the stacking model has the highest AUC, G-mean, and TPR, so it classifies customers effectively and identifies more customers who are likely to purchase. It therefore has high reference value in practical applications.
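The threshold-based metrics named in the abstract (TPR, TNR, F1, G-mean, accuracy) can all be derived from the four confusion-matrix counts. The sketch below is illustrative only and is not taken from the thesis: the function name `binary_metrics`, the toy labels, and the convention that label 1 means "purchase" are assumptions, and G-mean is taken to be the common definition, the square root of TPR times TNR. AUC is left out because it requires predicted scores rather than hard labels.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute TPR, TNR, F1, G-mean and accuracy from 0/1 labels.

    Assumption: 1 = customer with purchase intention (positive class),
    0 = customer without purchase intention (negative class).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard every ratio against an empty denominator.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0        # recall on buyers
    tnr = tn / (tn + fp) if (tn + fp) else 0.0        # recall on non-buyers
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * tpr / (precision + tpr)) if (precision + tpr) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    gmean = math.sqrt(tpr * tnr)                      # assumed definition

    return {"TPR": tpr, "TNR": tnr, "F1": f1,
            "Gmean": gmean, "Accuracy": accuracy}


# Hypothetical toy data: 4 buyers and 6 non-buyers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On imbalanced data, accuracy can look high even when few buyers are found, whereas G-mean falls toward zero if either class is poorly recognized. That may be why the abstract reports TPR, TNR, and G-mean alongside accuracy.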
Keywords/Search Tags:Classification, SMOTENC, Stacking