| Portfolio selection, which aims to allocate investment proportions across a set of assets in a principled way, has always been a basic research problem in quantitative investment. With the development of machine learning, online learning algorithms, an important branch of the field, have been used more and more widely, and selecting portfolios through online learning has gradually become a significant research subject. At the same time, with the development of China's economy, the liquidity and activity of the stock market have increased, and market information has become more complicated. Under these circumstances, it is increasingly important to absorb more information about the market and the listed companies when constructing portfolio strategies with online learning algorithms. Multi-Armed Bandit algorithms are an important class of online learning algorithms. Shen et al. [28] constructed a portfolio strategy using a Multi-Armed Bandit algorithm and achieved more cumulative wealth than the equally weighted portfolio. Building on their research, this paper takes the constituents of the CSI300 Index and the SSE50 Index as its empirical objects and uses the Multifactor Model to predict stock returns and their covariance matrix, aiming to absorb more information about the characteristics of the listed companies. To account for the correlation of factor returns over time, the covariance matrix of stock returns is Newey-West adjusted. In addition, the strategy is improved by using the KL-UCB algorithm, which has higher efficiency and stability. Specifically, first, valid factors are selected from 63 candidate factors by factor IC analysis and return analysis, and the Spearman rank correlation coefficients between them are calculated within each category. Factors whose correlation coefficient exceeds 0.7 are combined by IC weighting to form the final valid factors. Stock returns are predicted using a moving average model, and the covariance matrix is estimated according to the Barra Model. Second, the covariance matrix is divided into a systemic risk part and an idiosyncratic risk part, and the UCB algorithm is used to select the optimal portfolios according to the Sharpe ratio; these are then combined and adjusted to obtain the final weight vector. Finally, to further improve the cumulative return of the strategy, historical information is incorporated into the KL-UCB algorithm, the KL-UCB algorithm is adapted to the strategy, and the two optimal portfolios are redetermined, which yields a portfolio strategy based on the Multifactor Model and the KL-UCB algorithm. The backtest results show that the improved strategy using the KL-UCB algorithm significantly improves the Sharpe ratio and cumulative return compared with the CSI300 Index, the SSE50 Index and their respective equally weighted portfolios, and keeps the maximum drawdown within 20%, showing that the strategy can achieve a higher cumulative return while controlling the maximum loss. Furthermore, the cumulative return of the improved KL-UCB-based strategy is ahead for almost the entire backtest interval, indicating that the strategy can obtain stable returns. Thus, this paper demonstrates the rationality of a portfolio strategy that combines the Multifactor Model with Multi-Armed Bandit algorithms, and also demonstrates the effectiveness of the portfolio strategy based on the KL-UCB algorithm in the Chinese stock market. |
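To make the arm-selection step concrete, the following is a minimal sketch of the standard KL-UCB index for rewards in [0, 1], using the Bernoulli KL divergence and an exploration budget of log(t). It is not the adapted algorithm of this paper: how the Sharpe ratio is mapped to a bounded reward and how historical information is incorporated are not specified in the abstract, so the function names (`bernoulli_kl`, `kl_ucb_index`, `select_arm`) and the reward scaling are illustrative assumptions.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def kl_ucb_index(mean, pulls, t, tol=1e-6):
    """Largest q in [mean, 1] such that pulls * KL(mean, q) <= log(t).

    `mean` is the empirical mean reward of the arm (assumed to lie in [0, 1]),
    `pulls` is how often the arm has been played, and `t` >= 1 is the round.
    """
    if pulls == 0:
        return 1.0  # unplayed arms get the maximal index so they are tried first
    budget = math.log(t)
    lo, hi = mean, 1.0
    # KL(mean, q) is increasing in q on [mean, 1], so bisection finds the boundary.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if pulls * bernoulli_kl(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


def select_arm(means, pulls, t):
    """Return the index of the arm with the highest KL-UCB index."""
    indices = [kl_ucb_index(m, n, t) for m, n in zip(means, pulls)]
    return max(range(len(indices)), key=indices.__getitem__)


if __name__ == "__main__":
    # Hypothetical example: two candidate portfolios with equal empirical reward;
    # the less explored one receives the larger index and is selected.
    print(select_arm([0.5, 0.5], [100, 10], t=100))
```

In a portfolio setting, each arm would correspond to a candidate portfolio and the reward to some bounded transformation of its realized performance; that mapping is an assumption here and would need to follow the adaptation described in the paper itself.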