| Health insurance is a key component of China's medical insurance system. As public health awareness improves, demand for health insurance is gradually increasing. However, the density and depth of health insurance in China still lag far behind those of more mature foreign markets, and how to increase health insurance premium income has become an urgent problem for China's health insurance market. Drawing on the literature and using micro data from the 2018 China Family Panel Studies (CFPS), this thesis selects a set of indicators covering individual and family characteristics, social characteristics, and subjective psychological factors as variables affecting the demand for health insurance. The research proceeds as follows: (1) data preprocessing, feature engineering, and sample analysis are carried out; (2) a support vector regression (SVR) model, a BP neural network model, and a LightGBM prediction model are constructed and evaluated, with five-fold cross-validation performed on the data set; (3) the particle swarm optimization (PSO) algorithm and the SHAP method are used to optimize the machine learning models; (4) the SHAP and LIME methods are used to explain the optimal model; (5) deep learning models are used for further predictive analysis. The results show that: (1) the machine learning models optimized by PSO perform better than the same models before optimization; (2) the PSO-optimized LightGBM model outperforms the other five models on all five evaluation metrics used in this thesis; (3) interpreting the optimal model with the SHAP and LIME methods improves the reliability of the prediction results and helps model users understand the model; (4) compared with LIME, SHAP is better suited to explaining the optimal model in this thesis; (5) combining the PSO algorithm, the prediction model, and the SHAP method improves model performance; (6) the convolutional neural network model shows better generalization. Based on these results, the theoretical significance of this study is that it uses household micro data to examine health insurance coverage from a comprehensive perspective and builds the PSO-LightGBM-SHAP model, which is the most suitable of the models considered for predicting health insurance coverage, while providing visualizations that make the model easier to understand and the prediction results more reliable. The practical value of this study is that it offers suggestions on the development of health insurance from the perspectives of the government and insurance companies, providing a stronger theoretical basis and empirical support for the effective implementation of health insurance. |
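
The abstract does not give implementation details for the particle swarm optimization step, so the following is only a minimal sketch of how a basic global-best PSO search over model hyperparameters might look. Everything in it is an assumption made for illustration: the function names `pso_minimize` and `toy_validation_error`, the two tuned quantities (a learning rate and a tree depth), their bounds, and the PSO coefficients. The toy objective stands in for the cross-validated error of a model such as LightGBM, which in practice would be computed by retraining and validating the model for each particle position.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares one global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    # Hypothetical stand-in for a cross-validated error surface;
    # its minimum is at learning_rate = 0.1, depth = 6 by construction.
    learning_rate, depth = params
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6.0) ** 2


# Assumed search ranges for the two illustrative hyperparameters.
search_bounds = [(0.01, 0.5), (2.0, 12.0)]
best_params, best_err = pso_minimize(toy_validation_error, search_bounds)
print(best_params, best_err)
```

To tune a real model, `toy_validation_error` would be replaced by a function that trains the model with the candidate hyperparameters and returns its five-fold cross-validated error; integer-valued hyperparameters such as tree depth would need rounding inside that function.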