| Financial data describing the stock market environment is complex, noisy, and highly nonlinear. With either quantitative finance methods or traditional machine learning methods, it is difficult to capture changes in market conditions from high-dimensional stock time series and turn them into effective trading strategies. Considering the high noise, nonlinearity, and multi-dimensional temporal characteristics of portfolio data, this paper improves the DDPG algorithm with an LSTM network and applies it to asset allocation for stock index portfolios. The research consists of three parts. First, because a simple MLP network cannot adapt efficiently to changing time series trends, a stock price prediction model is built with an LSTM network and compared with an MLP network in a quantitative return study on Shanghai Composite Index datasets with different feature sets. The comparison verifies the superior training and prediction performance of LSTM on stock time series data and also shows that feature selection is important for obtaining high returns in quantitative research. Second, because multi-dimensional financial data is complex and noisy, we propose a multi-dimensional factor extractor based on multi-factor regression. After portfolio asset selection, it reduces the complexity of the factor data through a validity test based on the multi-factor model and correlation screening based on a factor heat map, and finally selects 40 effective, low-correlation factors from noisy high-frequency data. These factors serve as the input state of the deep reinforcement learning algorithm, which supports faster training convergence and stronger model expressiveness. Third, to realize intelligent portfolio management, a quantitative investment strategy based on the portfolio trading model LSTM_DDPG is constructed and studied empirically. Because traditional DDPG has difficulty modeling complex multi-asset time series accurately, an LSTM network is used as the agent's internal network so that it can better fit market dynamics and choose profitable actions. In addition, to help DDPG overcome convergence difficulties, three reward functions are constructed according to investors' degree of risk aversion, helping the agent balance exploration and exploitation. The LSTM_DDPG trading strategy is then deployed, its performance under different network structures and reward functions is compared, and the three LSTM_DDPG strategies based on the R1, R2, and R3 reward functions are compared with the buy-and-hold strategy, the mean-variance model, the CSI 300 index, and the DDPG model. Empirical results show that LSTM_DDPG-based portfolio trading strategies achieve higher returns and Sharpe ratios with lower maximum drawdown and volatility. They capture profit opportunities more accurately, provide a reference for investors with different risk tolerances, and improve investors' ability to deal with market risk. |
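
The abstract compares strategies by Sharpe ratio, volatility, and maximum drawdown but does not say how they are computed. The sketch below uses the standard textbook definitions on a series of per-period simple returns. The function names, the 252-day annualization factor, and the zero default risk-free rate are illustrative assumptions, not details taken from the paper.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization factor for daily returns


def annualized_volatility(returns, periods=TRADING_DAYS):
    """Sample standard deviation of per-period returns, scaled to one year."""
    return stdev(returns) * math.sqrt(periods)


def sharpe_ratio(returns, risk_free_rate=0.0, periods=TRADING_DAYS):
    """Annualized Sharpe ratio; risk_free_rate is an annual rate (assumed 0)."""
    excess = [r - risk_free_rate / periods for r in returns]
    sd = stdev(excess)
    if sd == 0:
        return float("nan")  # undefined when excess returns do not vary
    return mean(excess) / sd * math.sqrt(periods)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the cumulative wealth curve, in [0, 1]."""
    wealth = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        wealth *= 1.0 + r                   # compound the per-period return
        peak = max(peak, wealth)            # highest wealth seen so far
        worst = max(worst, (peak - wealth) / peak)
    return worst
```

Under these definitions, a strategy with a higher Sharpe ratio and a lower maximum drawdown delivers more return per unit of risk and shallower losses from its peak value, which is the sense in which the strategies are ranked in the abstract.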