
Used Car Transaction Price Prediction Based On Fusion Model

Posted on: 2024-02-01 | Degree: Master | Type: Thesis
Country: China | Candidate: X N Yang | Full Text: PDF
GTID: 2542307160479724 | Subject: Applied Statistics
Abstract/Summary:
As China's economy continues to grow, motor vehicle ownership and the number of newly registered vehicles are both rising, and the used car trading market has expanded accordingly. However, there is not yet a standardized and complete assessment system for pricing used cars, which makes it difficult to determine used car transaction prices. The methods currently used to determine these prices are mainly the experience-based prevailing market value method and the replacement cost method, both of which are heavily influenced by human judgment. Machine learning algorithms are expected to estimate used car transaction prices more scientifically and efficiently and to provide a reference for used car pricing.

This study uses a portion of the data from the 2021 MathorCup Big Data Competition, with a total of 30,000 samples and 36 raw variables. First, it carries out exploratory data analysis and pre-processing along five dimensions: missing values, outliers, categorical variables, new features, and normalization. It then tries a variety of feature selection and model building methods for predicting used car transaction prices.

For feature selection, this study proposes a method based on the FI-EWM-TOPSIS framework. The framework has two parts. First, Random Forest and XGBoost each output a feature importance (FI) value for every feature. Second, the two sets of feature importance are weighted using the entropy weight method (EWM), the comprehensive score of each variable is calculated objectively and ranked using TOPSIS (the technique for order preference by similarity to ideal solution), and the 25 highest-scoring features are taken as the final feature set. A comparison with feature selection by the correlation coefficient method and by the Lasso method shows that the FI-EWM-TOPSIS method improves the evaluation metrics MAE, MSE and R² by at least 3.08%, 4.61% and 2.11%, respectively. Moreover, the features selected by this method span a sufficiently wide range of categories and capture information from various aspects.

For model building, this study first builds six individual models: Support Vector Machine, Random Forest, XGBoost, LightGBM, CatBoost and DNN. The experimental results show that XGBoost and CatBoost have smaller MAE and MSE and larger R² than the other models, that is, smaller prediction errors and a better fit. Based on XGBoost and CatBoost, this study builds the SRXLCD-X-Stacking and SRXLCD-C-Stacking fusion models using the Stacking strategy, and the X-C-Voting fusion model using the Voting strategy. The experimental results show that the SRXLCD-X-Stacking model has the smallest MAE (0.4352) and MSE (2.3427) and the largest R² (0.9908), and that its prediction results fluctuate little and are tightly distributed across repeated runs of the experiment. For the data and methods tried in this study, the best model for predicting used car transaction prices is the SRXLCD-X-Stacking fusion model based on the FI-EWM-TOPSIS method.
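
The FI-EWM-TOPSIS ranking described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the thesis normalizes the importance values or defines the ideal solutions, so the choices below (share-based entropy, vector normalization, and max/min ideal points) are textbook defaults assumed for illustration. The function names and the four example features with their importance values are hypothetical and are not taken from the thesis data.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: one weight per column (importance source).

    Assumes at least two rows and a positive sum in every column.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [value / total for value in column]
        # Normalized Shannon entropy; zero shares contribute nothing.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n_rows)
        divergences.append(1.0 - entropy)
    divergence_sum = sum(divergences)
    if divergence_sum == 0:  # every column is uniform: fall back to equal weights
        return [1.0 / n_cols] * n_cols
    return [d / divergence_sum for d in divergences]


def topsis_scores(matrix, weights):
    """TOPSIS closeness of each row to the ideal solution (higher is better)."""
    n_cols = len(weights)
    # Vector-normalize each column, then apply the column weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix]
    # Both columns are "larger is better", so the ideal point is the column maximum.
    best = [max(row[j] for row in weighted) for j in range(n_cols)]
    worst = [min(row[j] for row in weighted) for j in range(n_cols)]
    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((row[j] - best[j]) ** 2 for j in range(n_cols)))
        d_worst = math.sqrt(sum((row[j] - worst[j]) ** 2 for j in range(n_cols)))
        denominator = d_best + d_worst
        scores.append(d_worst / denominator if denominator > 0 else 0.5)
    return scores


def rank_features(names, rf_importance, xgb_importance, top_k):
    """Combine two importance vectors and return the top_k (name, score) pairs."""
    matrix = [[r, x] for r, x in zip(rf_importance, xgb_importance)]
    weights = entropy_weights(matrix)
    scores = topsis_scores(matrix, weights)
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Hypothetical importance values for four made-up features.
names = ["mileage", "car_age", "brand", "color"]
rf_importance = [0.40, 0.35, 0.20, 0.05]
xgb_importance = [0.30, 0.45, 0.20, 0.05]

ranked = rank_features(names, rf_importance, xgb_importance, top_k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In the thesis setting, the two importance vectors would come from the fitted Random Forest and XGBoost models, and `top_k` would be 25.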
Keywords/Search Tags:Used Car Transaction Price, Feature Importance, XGBoost, CatBoost, Model Blending