A Preliminary Study On Quantitative Multi-factor Stock Selection Based On Machine Learning

Posted on: 2020-08-23; Degree: Master; Type: Thesis
Country: China; Candidate: L Y Li; Full Text: PDF
GTID: 2439330572488356; Subject: National Economics
Abstract/Summary:
Quantitative investment is an investment method that extracts effective experience from historical data and uses it as guidance for future decisions. Unlike traditional subjective investment, quantitative investment is not affected by fluctuations in human mood, and thus maintains stability and consistency in its investment logic. As an important strategy in quantitative equity investment, multi-factor stock selection has long been a focus of both academia and industry. Whereas traditional multi-factor stock selection models concentrate on the linear relationship between the returns of the underlying assets and the explanatory variables, the first part of this paper establishes a stock-picking strategy based on two decision-tree ensemble algorithms, random forest and XGBoost, using A-share market data from April 2006 to June 2018. The strategies were tested separately as a long-only strategy and as a market-neutral strategy. The random forest model with 18-month rolling training achieved a 23.8% annualized return and a Sharpe ratio of 0.673 in the long-only strategy, and a 26.7% annualized return and a Sharpe ratio of 1.447 in the market-neutral strategy; the XGBoost model with 18-month rolling training achieved a 26.8% annualized return and a Sharpe ratio of 0.782 in the long-only strategy, and a 28.9% annualized return and a Sharpe ratio of 1.594 in the market-neutral strategy. Both strategies perform far better than the CSI 300 and CSI 500 Index benchmarks. In the second part, we proposed an XGBoost model based on a focal-loss function, which was expected to perform better on hard-to-classify observations but failed to do so. As a result, a model ensembling method that combines models trained on training sets of different lengths is proposed, which effectively improves the model's ability to respond to market changes. Based on the fact that the overall A-share market style in China is stable, this paper relaxes the assumption that the observations in the training set and test set are independently and identically distributed. Therefore, the validity of cross-sectionalizing time series data remains to be verified in future work.
Keywords/Search Tags: Multi-factor Model, Random forest, XGBoost
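
The abstract reports annualized returns and Sharpe ratios but does not state how they were computed. As a minimal sketch only, the Python below shows one common convention. It assumes monthly simple returns, 12 periods per year, a zero risk-free rate by default, and a market-neutral return defined as the long portfolio's return minus a hedge (index) return. All function names and these conventions are illustrative assumptions, not the thesis's actual method.

```python
import math
from statistics import mean, stdev


def annualized_return(period_returns, periods_per_year=12):
    """Geometric annualized return from simple per-period returns.

    Assumption: returns are monthly, hence periods_per_year=12 by default.
    """
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r  # compound each period's return
    years = len(period_returns) / periods_per_year
    return growth ** (1.0 / years) - 1.0


def sharpe_ratio(period_returns, risk_free_per_period=0.0, periods_per_year=12):
    """Annualized Sharpe ratio: mean excess return over its sample
    standard deviation, scaled by sqrt(periods_per_year).

    Assumption: the risk-free rate defaults to zero.
    """
    excess = [r - risk_free_per_period for r in period_returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


def market_neutral_returns(long_returns, hedge_returns):
    """Hypothetical market-neutral return series: long portfolio return
    minus the hedge (for example, an index) return in the same period."""
    return [l - h for l, h in zip(long_returns, hedge_returns)]
```

With these helpers, a long portfolio's monthly returns can be passed to `annualized_return` and `sharpe_ratio` directly, or first passed through `market_neutral_returns` with a benchmark series to evaluate the hedged variant.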