Research on Predicting CSI300 Index Movement Based on the Random Forest Method

Posted on: 2018-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: W Wang
Full Text: PDF
GTID: 2359330542967727
Subject: Finance
Abstract/Summary:
In modern intelligent technology, machine learning is a critical tool for data mining. Random forest, one of the most important machine learning algorithms, combines the Bagging algorithm with CART to build a forest that uses decision trees as its base classifiers, and it therefore belongs to the family of ensemble learning methods. Numerous studies confirm that, by combining many classifiers, random forest overcomes the limitations of a single classifier and shows superior performance in classification and prediction. Because the algorithm requires few parameter settings, is highly efficient, tolerates noise, and is less prone to over-fitting, it is widely used for classification and prediction in IT, biotechnology, medical science, image recognition, and finance. In particular, because the stock market is dynamic, non-linear, non-parametric, and chaotic, random forest is better suited to studying it than traditional data analysis methods, and the algorithm is attracting increasing attention.

This paper introduces the theory of ensemble learning and random forest, and selects 16 technical indicators based on previous research. These indicators are used as input variables for training random forest models, and the random forest is optimized by testing parameter settings and variable selection. The optimized model is then applied to predict the movement of the CSI300 index, and its prediction precision is tested. A traditional parametric model, the Logistic model, is also built for comparison with the random forest, in order to study the predictive and variable selection capability of random forest. The results show that random forest performs much better than the Logistic model at predicting stock index movements on the same dataset. A repeated-test method is also adopted to compare prediction performance under different parameter settings, in order to determine the number of decision trees, ntree, and the number of random features, mtry. Optimizing the model with the different random forest variable selection criteria can significantly improve its predictive capability and reduce model complexity. The random forest variable selection method also successfully optimizes the Logistic model, which suggests that it generalizes to a certain extent.
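To make the roles of ntree and mtry concrete, the following is a minimal sketch of the bagging-plus-CART idea described above, written in plain Python. It is not the thesis's implementation: the data are synthetic, the four features are hypothetical stand-ins for the 16 technical indicators (which the abstract does not list), and all function names and the `max_depth` limit are assumptions made for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feature_ids):
    """Return the (feature, threshold) pair that lowers weighted Gini most, or None."""
    best, best_score, n = None, gini(y), len(y)
    for f in feature_ids:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent observed values
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(X, y, mtry, rng, depth=0, max_depth=5):
    """Grow a CART-style tree; each node considers only mtry random features."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(set(y)) == 1:
        return majority
    split = best_split(X, y, rng.sample(range(len(X[0])), mtry))
    if split is None:
        return majority
    f, t = split
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       mtry, rng, depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       mtry, rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf; leaves are plain class labels."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, ntree, mtry, seed=0):
    """Bagging: train ntree trees, each on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    n = len(y)
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], mtry, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Synthetic data (an assumption, not CSI300 data): 4 made-up features,
# label 1 ("up") when the first two features sum to a positive value.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
y = [1 if row[0] + row[1] > 0 else 0 for row in X]
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

forest = fit_forest(X_train, y_train, ntree=25, mtry=2)
predictions = [predict_forest(forest, row) for row in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"hold-out accuracy: {accuracy:.2f}")
```

Calling `fit_forest` in a loop over several `ntree` and `mtry` values and comparing the hold-out accuracy would be one simple way to mimic the repeated parameter tests the abstract mentions, although the thesis's exact procedure is not given here.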
Keywords/Search Tags: Random Forest, Technical Indicators, Stock Index Prediction, Logistic Model