| In recent years, increasingly fierce market competition in China and growing instability in the internal and external management of enterprises have exposed hidden problems in some enterprises, ultimately leading to corporate financial crises. A serious financial crisis may even lead to bankruptcy and delisting, causing huge losses for investors, creditors and other stakeholders. However, a financial crisis forms gradually, which makes it possible to identify one in advance. Since the 1930s, early warning of corporate financial crisis has gradually received the attention of scholars. With the passage of time and the progress of science and technology, the means of studying early warning models of corporate financial crisis have multiplied, from univariate and multivariate models to probability function models and, with the development of artificial intelligence, machine learning. Machine learning has unique advantages in dealing with multi-dimensional, noisy and non-linear financial data, and it has been applied to financial crisis early warning models since the 1990s. However, a review of the research at home and abroad shows that most previous scholars focused on the optimization and prediction of a single model, and few studied the integration of multiple learners to build a financial crisis early warning model. Based on this, this paper attempts to build a financial crisis early warning model using a stacking multi-algorithm fusion model, which can stack multiple heterogeneous learners. This paper takes all Chinese A-share listed companies as the research object and takes the first marking of a listed company’s shares as ST (Special Treatment) by the Shanghai or Shenzhen Stock Exchange as the sign of financial crisis. First, data preprocessing and feature filtering were carried out on the collected data, and a financial crisis 
early warning index system was established, including five types of financial indicators (profitability, operating capacity, development capacity, solvency and risk level) and three types of non-financial indicators (industry classification, internal governance and market performance), for a total of 8 aspects and 22 dimensions. Dummy variables are used to encode character-valued (categorical) data. To balance the sample data, this paper adopts the SMOTETomek combined sampling method, which pairs SMOTE oversampling with Tomek links undersampling. The data are then divided into a training set and a test set at a ratio of 7:3, and three single algorithms (logistic regression, decision tree and support vector machine) and three ensemble learning algorithms (random forest, GBDT and XGBoost) are used for modeling and prediction. The prediction performance of the six models is evaluated using accuracy, precision, recall, F1 value, AUC value and the ROC curve. Finally, the best-performing models are fused with the stacking algorithm, the prediction performance is observed, and the feature importance of the first-layer models is used to find the important factors that may lead to a company’s financial crisis. The results show that when each algorithm is used on its own for modeling and prediction, the ensemble learning algorithms perform well on all evaluation indicators: XGBoost is the best in terms of accuracy, precision, recall and F1 value, and random forest is the best in terms of AUC value. Although the three single-algorithm models do not predict as well as the ensemble learning models, their accuracy and other indicators are still greater than 0.8, so their prediction performance is also excellent. The decision tree, SVM, random forest, GBDT and XGBoost models, which performed excellently, are used for stacking fusion. The prediction performance of the combined stacking model is improved by 
0.0018 to 0.0103 compared with that of the single algorithms, which shows that a financial crisis early warning model for listed companies based on the stacking algorithm is highly feasible. The feature importance of random forest, GBDT and XGBoost shows that net profit margin and return on net assets among the profitability indicators, the growth rate of total assets among the development capacity indicators, and industry classification among the non-financial indicators have a great impact on the results. Investors, creditors, operators and other stakeholders should focus on these indicators when building a financial crisis early warning system. |
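
The stacking step described in the abstract can be illustrated with a minimal sketch in scikit-learn. It is not the paper's implementation, and several things in it are assumptions: the data are synthetic (22 generated features standing in for the 22 indicator dimensions), the split is stratified, the second-layer learner is a logistic regression, and all hyperparameters are arbitrary. XGBoost is left out so that the example depends only on scikit-learn, and the SMOTETomek resampling step (available in the separate imbalanced-learn package) is also omitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic, imbalanced data in place of the real indicator table.
X, y = make_classification(
    n_samples=600, n_features=22, n_informative=8,
    weights=[0.8, 0.2], random_state=0,
)

# 7:3 train/test split, as in the abstract (stratification is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# First-layer heterogeneous learners (XGBoost omitted, see the note above).
base_learners = [
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gbdt", GradientBoostingClassifier(random_state=0)),
]

# Second-layer learner trained on cross-validated first-layer predictions.
# Assumption: the abstract does not name the meta-learner.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
y_prob = stack.predict_proba(X_test)[:, 1]

# The evaluation indicators named in the abstract.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "auc": roc_auc_score(y_test, y_prob),
}
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

If resampling were added to this sketch, it would normally be applied to the training split only, so that the test set keeps the original class ratio.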