Research On Unbalanced Sample Processing Under Corporate Financial Distress Prediction

Posted on:2024-01-16

Degree:Master

Type:Thesis

Country:China

Candidate:T Li

Full Text:PDF

GTID:2569307133495394

Subject:Applied statistics

Abstract/Summary:

Bonds are one of the most effective means of direct financing for enterprises, and the continuous development of this market has enriched and improved China’s financial system at multiple levels. At the same time, bond defaults, in which companies in financial difficulty cannot pay principal and interest on time, have become common in recent years. This poses a serious hidden risk, whether for companies seeking to prevent financial crises, for investors and creditors seeking to protect their capital rights, or for government departments seeking to supervise the normal operation of the capital market effectively. Accurate and effective prediction of corporate financial distress is therefore of great practical significance. In addition, as the reform of China’s market economy system deepens and the capital market develops rapidly, demand for research on corporate financial distress prediction has become increasingly urgent. There is a pressing need to combine artificial intelligence with traditional business and to establish an effective early warning mechanism for financial status through better forecasting methods.

This thesis takes the particular factors of China’s national conditions into account, constructs a prediction index system suited to those conditions, and focuses on the processing of unbalanced samples in financial distress prediction, drawing on current data mining technology and machine learning theory. It proposes an early warning model of corporate financial distress that offers high prediction accuracy, quantifiability and a degree of explainability. The main research contents are as follows:

First, against the background of China’s national conditions, this thesis uses domestic bond market data from 2014 to 2021 and takes existing domestic and foreign research on corporate financial distress prediction as a reference. Starting from four major groups of determinants (accounting information, market information, macro information and other information), and considering the compatibility between data, models and variable indicators, it summarizes the specific index variables that measure a company’s ability in each respect as the basis for the index system. These variables are screened with ANOVA, correlation analysis, VIF tests and other methods to eliminate variables that conflict with one another or are incompatible with the data and models, yielding a relatively comprehensive, effective and reasonable index system. Outliers in the data are detected with the robROSE algorithm to construct the company financial data set.

Second, to deal with the imbalanced sample problem, this thesis builds balanced data sets without regional information features using the traditional SMOTE algorithm and the GAN algorithm, and balanced data sets with regional information features using the Borderline-SMOTE algorithm and the BMW-SMOTE algorithm. Of the two groups, the one containing regional information features performed significantly better.

Third, this thesis takes the unprocessed data set and the balanced data sets produced by the traditional SMOTE, GAN, Borderline-SMOTE and BMW-SMOTE algorithms as the data basis for the classification models. It constructs single-learner classification models, namely logistic regression (LR), artificial neural network (ANN) and support vector machine (SVM), and multi-learner ensemble models, namely random forest (RF), extreme gradient boosting (XGBoost), gradient boosting decision tree (GBDT) and adaptive boosting (AdaBoost). Classification performance was evaluated with the F-measure, G-mean and AUC. The results showed that regional information features improve performance on the imbalanced sample problem, and that building the financial distress early warning model with ensemble learning algorithms brings performance advantages.
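As a rough illustration of the oversampling and evaluation steps named in the abstract, the sketch below implements the basic SMOTE interpolation idea and the three reported metrics in plain Python. It is not the thesis’s code: the function names (`smote`, `f_measure_and_g_mean`, `auc`), the toy data and the choice of `k` are assumptions made for this example, and Borderline-SMOTE, BMW-SMOTE, GAN-based sampling and robROSE are not covered.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Basic SMOTE idea: create n_new synthetic points, each placed on the
    segment between a minority sample and one of its k nearest minority
    neighbours. Needs at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of `base`, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def f_measure_and_g_mean(y_true, y_pred):
    """F-measure (F1) and G-mean for binary labels, 1 = distressed."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    g_mean = math.sqrt(recall * specificity)
    return f1, g_mean


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative (ties count one half). Needs both classes present."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy usage with made-up numbers (hypothetical, not thesis data).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
f1, g_mean = f_measure_and_g_mean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
area = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The G-mean is the geometric mean of sensitivity and specificity, so it falls toward zero whenever a classifier ignores the minority (distressed) class. This is why it is commonly reported alongside the F-measure and AUC for imbalanced data rather than plain accuracy.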

Keywords/Search Tags:

financial distress forecast, unbalanced samples, data part, ensemble learning

Related items

1	Financial Distress Prediction Model Based On RS-SVM-data Mining Technology
2	Research On Classification Of Unbalanced Financial Data Based On Ensemble Learning
3	Research On Risk Prediction Of Unbalanced Financial Data Based On Ensemble Learning
4	The Study On Financial Distress Prediction Of Listed Manufacture Companies Based On Non-paired Samples
5	Research On Employee Turnover Prediction Based On SMOTE-SVM Under Unbalanced Data
6	Comparative Analysis Of Unbalanced Data Classification Methods In The Field Of Financial Prediction
7	Prediction Of Financial Distress Of A-share Manufacturing Companies Based On Integrated Learning
8	Dynamic Prediction Of Financial Distress Based On Imbalanced Data Stream Of An Industry
9	Research On The Early Warning Model Of Financial Distress Of Listed Companies Based On Machine Learning
10	The Research On Electricity Larceny Prevention With Data Mining In AMI