| At present, auto insurance pricing in China is relatively coarse: vehicle-type pricing factors are not considered, and the generalized linear model (GLM) in use is limited, resulting in low granularity and low accuracy in auto insurance rate determination. Especially since the comprehensive reform of auto insurance began in September 2020, homogeneous competition in the domestic auto insurance market has intensified, mainly in the form of price wars, which is itself a consequence of insufficiently refined and differentiated rate determination. To address these problems, this paper draws on the pricing methods of developed auto insurance markets such as Europe and the United States and incorporates vehicle factors, such as vehicle physical parameters and physical collision test results, into the auto insurance pricing model. Based on vehicle physical parameters, physical collision test data, claims-side data, and insurance-side data, a traditional generalized linear model and the machine learning models XGBoost and LightGBM are established. In modeling with XGBoost and LightGBM, the SHAP package is used for feature selection, and grid search with cross-validation (GridSearchCV) is used to tune the parameters of both models. The mean absolute error (MAE) and the coefficient of determination R² (R-squared) are used to evaluate the models. Finally, the pricing methods of the models are compared. The results show that the new model factors introduced into the pricing model, namely the 18 parts-to-whole ratio, the repairability index, and car weight, rank among the most important variables. These factors have a great impact on the final total claim amount of vehicle damage insurance. In addition, when dealing with the total amount of vehicle damage insurance claims with many features and large data volumes, the prediction accuracy and robustness of the machine learning models XGBoost and LightGBM are higher than those
of the traditional auto insurance pricing model, the GLM. On the data of this study, the prediction accuracy of LightGBM is slightly higher than that of XGBoost. Previous studies have regarded machine learning models as a "black box", but in this study XGBoost and LightGBM are also highly interpretable when used with the SHAP package: the importance of each factor and its impact on the predicted variable are reflected by its SHAP value. Previous studies also hold that machine learning algorithms occupy a large amount of memory and run inefficiently. XGBoost does have these problems, but LightGBM mitigates them, offering both high efficiency and high prediction accuracy. Therefore, across all dimensions considered, LightGBM is better than XGBoost and GLM. This study provides empirical analysis and suggestions to help the auto insurance business achieve refined pricing, so that rates are fair and differentiated. Only when property insurance companies can allocate costs at a finer granularity and more accurately than their competitors can they show real advantages in market competition, thereby promoting the sustainable and healthy development of the industry, promoting road traffic safety, and achieving a three-way win for consumers, insurance companies, and regulatory authorities. |
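
For readers less familiar with the two evaluation metrics named above, the following is a minimal sketch of how the mean absolute error and the coefficient of determination can be computed for a set of predicted claim totals. It is an illustration only, not the code used in this study: the function names `mean_absolute_error` and `r_squared` and the sample claim amounts are hypothetical and chosen purely for demonstration.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of |actual - predicted| over all observations."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left unexplained by the model.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variation of the actual values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical total claim amounts (illustrative values, not data from the study).
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3800.0]

mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"MAE = {mae:.1f}, R^2 = {r2:.2f}")
```

A lower MAE and an R² closer to 1 indicate a better fit, which is the sense in which the models are compared here.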