
Research On Classification Of Unbalanced Financial Data Based On Ensemble Learning

Posted on: 2021-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Hao
Full Text: PDF
GTID: 2439330620463401
Subject: Applied Statistics
Abstract/Summary:
With the rapid development of economic globalization, market competition is becoming increasingly fierce, bringing enterprises opportunities for development but also many risks and challenges. Timely detection of an enterprise's own financial risks not only protects the interests of stakeholders but also supports the stable development of the macro economy. Researchers have built a large number of models for financial risk prediction, but these methods rarely take the imbalanced nature of financial data into account, so the recognition rate for minority samples is low and classification performance is poor, which leads to unreasonable classification results. In view of this, this thesis uses ensemble learning algorithms, supplemented by optimization theory, to study the classification of unbalanced financial data, and obtains a better-performing prediction model through empirical analysis.

The research sample consists of 124 listed manufacturing companies that were placed under special treatment (ST) between 2014 and 2018. Instead of matching them with normally operating companies at the 1:1 ratio used in traditional models, a 1:3 ratio is adopted on the basis of previous studies, and 372 normal companies are selected so that the data set is more realistic. Drawing on profitability, solvency, growth ability, operating ability, cash flow, capital structure, equity governance structure, and important macroeconomic factors, 28 ratio indicators in 2 categories and 8 subcategories were preliminarily selected. The thesis focuses on two tasks: selecting the indicators used to classify unbalanced financial data, and constructing and evaluating the models.

(1) Construct a classification index system. Before the model is built, a statistical test is applied to the initial indicators to eliminate those that show no significant difference between the two groups of companies, and the K-means algorithm and gray relational analysis are then used to screen the remaining indicators further. The experimental results show that 16 classification indicators with significant differences remain after the statistical test, and that 6 classification indicators are finally retained after the subsequent cluster analysis and included in the classification index system: return on net assets, return on assets, current ratio, interest coverage ratio, asset-liability ratio, and the manufacturing index.

(2) Build and evaluate financial classification models. Based on the constructed classification index system, and after selecting the model hyperparameters, financial classification models based on the random forest and XGBoost algorithms are established, and their results are compared with the classification performance of traditional models. The experimental results show that the XGBoost-based model performs best across the evaluation metrics, with accuracy reaching up to 93.29%, and is rated best in the overall evaluation. This shows that it is well suited to financial classification and can screen out as many companies with financial risks as possible.

Taken together, the theoretical research and the experimental verification show that models based on ensemble learning handle unbalanced financial data better, classify financial data more accurately, and identify enterprises facing financial risk more efficiently.
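To make concrete why accuracy alone can mislead on a sample of this shape, the following minimal sketch computes several confusion-matrix metrics for the 124 / 372 split described above. Only the two sample sizes come from the abstract. The function name `imbalance_metrics`, the choice of recall, specificity, F1, and G-mean as complementary metrics, and the "label every company normal" baseline are assumptions made for illustration; the abstract reports accuracy but does not name the other metrics the thesis used.

```python
from math import sqrt


def imbalance_metrics(tp, fn, fp, tn):
    """Compute accuracy and minority-aware metrics from confusion-matrix counts.

    The positive class is taken to be the minority class
    (specially treated companies); this naming is an assumption.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Recall: share of minority-class companies that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of normal companies correctly left unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # G-mean balances performance on both classes.
    g_mean = sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "g_mean": g_mean,
    }


# Sample sizes taken from the abstract.
N_ST, N_NORMAL = 124, 372

# Hypothetical baseline: every company is labelled "normal",
# so no specially treated company is ever flagged.
baseline = imbalance_metrics(tp=0, fn=N_ST, fp=0, tn=N_NORMAL)

for name, value in baseline.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the baseline reaches 75% accuracy (372 of 496 companies) while its recall on the minority class is zero, which is why a reported accuracy figure is usually read alongside minority-class metrics when the classes are unbalanced.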
Keywords/Search Tags:Unbalanced Financial Data, Ensemble Learning, Classification Indexed System, Random Forest, XGBoost