Application Of A New Unbalanced Data Processing Method In Stock Classification

Posted on: 2019-01-05 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Q Zhang
GTID: 2429330545472123 | Subject: Statistics
Abstract/Summary:
With the development of China's financial market, more and more investors are turning to the stock market. How to analyze stocks scientifically is a central question for every investor. For stock selection based on company fundamentals, the financial indicators of listed companies are particularly important. However, the number of high-quality stocks is much smaller than the number of ordinary stocks, so the data set is unbalanced. In addition, a company's financial data is often high-dimensional and contains irrelevant features. It is therefore necessary both to balance the data set and to perform feature selection.

In this paper, we improve on the borderline-SMOTE and ADASYN algorithms. First, we propose a hybrid algorithm, BASMOTE. Building on borderline-SMOTE, it introduces the adaptive idea of ADASYN: new minority samples are synthesized adaptively according to the distribution of the boundary samples. Fewer samples are synthesized in regions that are easy to classify and more in regions that are harder to classify, which yields a more effective and reasonable set of new minority samples. Second, we propose a hybrid feature selection method, HPMG, which introduces the idea of wrapper feature selection into three filter feature selection methods. The training accuracy of the classifier determines the number of features kept by each filter method, and a simple voting scheme, as used in ensemble methods, determines the final result.

The experiments use financial data for listed A-share companies in one industry, with SVM as the classifier. BASMOTE is compared with several existing oversampling methods, and HPMG with several existing feature selection methods. In these experiments, BASMOTE and HPMG perform better than the existing oversampling and feature selection methods they are compared with.
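
The adaptive oversampling idea in the abstract can be illustrated with a short sketch. The Python code below is not the thesis's BASMOTE implementation, which the abstract does not spell out. It is a minimal illustration under stated assumptions: boundary ("danger") minority samples are picked with the usual borderline-SMOTE rule (at least half, but not all, of the k nearest neighbours belong to the majority class), and the number of synthetic samples per boundary sample is made proportional to its share of majority neighbours, in the spirit of ADASYN. The function name `adaptive_borderline_oversample`, its parameters, and the rounding of per-sample counts are hypothetical choices made for illustration.

```python
import math
import random


def adaptive_borderline_oversample(minority, majority, n_new, k=5, seed=0):
    """Illustrative sketch only (hypothetical name and parameters).

    minority, majority: lists of equal-length numeric tuples.
    n_new: approximate number of synthetic minority samples to create.
    """
    rng = random.Random(seed)
    labelled = [(p, 0) for p in majority] + [(p, 1) for p in minority]
    offset = len(majority)  # index of the first minority point in `labelled`

    # Step 1 (borderline-SMOTE rule): keep minority points whose k nearest
    # neighbours are mostly, but not entirely, majority points.
    danger = []  # pairs of (minority index, share of majority neighbours)
    for i, p in enumerate(minority):
        others = [t for j, t in enumerate(labelled) if j != offset + i]
        neighbours = sorted(others, key=lambda t: math.dist(p, t[0]))[:k]
        m = sum(1 for _, label in neighbours if label == 0)
        if k / 2 <= m < k:
            danger.append((i, m / k))

    if not danger:
        return []

    # Step 2 (ADASYN-style weighting): harder points, i.e. those with more
    # majority neighbours, receive a larger share of the n_new samples.
    total = sum(ratio for _, ratio in danger)
    synthetic = []
    for i, ratio in danger:
        p = minority[i]
        mates = sorted(
            (q for j, q in enumerate(minority) if j != i),
            key=lambda q: math.dist(p, q),
        )[:k]
        if not mates:
            continue  # a single minority point has nothing to interpolate to
        # Rounding means the overall count can differ slightly from n_new.
        for _ in range(round(n_new * ratio / total)):
            q = rng.choice(mates)
            gap = rng.random()
            # Step 3 (SMOTE interpolation): a random point on the segment p-q.
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return synthetic
```

The sketch uses only the standard library and a brute-force neighbour search, so it is suitable for small examples rather than full financial data sets. Minority points whose neighbours are all majority points are treated as noise and skipped, as in borderline-SMOTE.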
Keywords/Search Tags: stock selection, imbalanced data, BASMOTE, feature selection, HPMG