
Portfolio Management Research Based On Multi-Agent Reinforcement Learning

Posted on: 2023-01-03 | Degree: Master | Type: Thesis
Country: China | Candidate: Q D He | Full Text: PDF
GTID: 2558307061455724 | Subject: Management Science and Engineering
Abstract/Summary:
In recent years, with the rapid development of financial trading systems, a large share of trading activity has come to be dominated by algorithmic trading. Automatic trading systems based on deep reinforcement learning are a research hotspot with extensive application value and academic significance. Many researchers are working in this area, and many algorithmic trading models have been proposed. However, many of these models cannot respond quickly and adapt to changing market conditions, so even when they reach a certain level of profitability, the stability of their returns cannot be guaranteed.

Therefore, this thesis establishes a portfolio management model based on multi-agent reinforcement learning. Building on the deep deterministic policy gradient algorithm, it proposes a new policy network and improves the framework with multi-agent reinforcement learning and reward shaping, so that the model has different optimization and exploration directions under different market conditions. The model trades automatically based on market conditions and historical data, taking into account both profitability and risk control. The main contributions of this thesis are as follows:

1. For the portfolio management problem, a new policy network (Multi-LSTM-IIE) is proposed for the reinforcement learning model. The policy network consists of a long short-term memory network and an independent evaluator. The input of the reinforcement learning model is composed of the price feature tensor and the portfolio weights of the previous period, where the price feature tensor is built from the changes in the opening, highest, lowest and closing prices; the output is the new portfolio weights. The new policy network can handle the impact of different price features on the allocation of asset weights. Different assets flow independently through the network, but the network parameters are shared. The experimental results show that Multi-LSTM-IIE can learn market knowledge from multi-dimensional price features. Compared with traditional investment models, the model in this thesis achieves higher returns. The results also show that deep reinforcement learning can be applied to portfolio management problems.

2. Reinforcement learning models are sensitive to hyperparameters and reward functions: hyperparameters greatly affect model performance, and the reward function affects the direction of exploration and the total return. This thesis explores the length of the state time window and the reward discount factor, and compares the impact of two reward functions on model performance. The hyperparameters of the model are determined through experiments. The experiments show that the model using the daily logarithmic rate of return as the reward function is more profitable, while the model using the Sharpe ratio as the reward function has better risk control, but neither model achieves both.

3. To make the model take into account both profitability and risk control, the agent’s reward is shaped. Drawing on previous research and the experimental experience of this thesis, a reward function is proposed that integrates the daily logarithmic rate of return and the Sortino ratio as a linear weighted combination of the two indicators. The weights that give the best model performance are determined experimentally.

4. Since the financial market is extremely unstable and the market environment changes rapidly, this thesis proposes a portfolio management model based on multi-agent reinforcement learning (MAPM). The model sets up three trading agents with the same structure but different parameters, according to a classification of the market trend. The agents use Multi-LSTM-IIE as the policy network and have different reward functions and independent experience pools, which gives them different optimization and exploration directions. Each agent performs policy optimization independently, without interfering with the others. In experiments against a single agent, MAPM performs better on the various evaluation indicators, with an average annual rate of return of 46.59% and a maximum drawdown of only 3.85%, combining higher returns with lower risk.

This thesis applies deep reinforcement learning to the problem of portfolio management in the stock market, proposes a policy network suited to portfolio management, and improves the model using multi-agent reinforcement learning and reward shaping. The trading model can reasonably represent the market state from historical data, propose a suitable strategy for each market environment, and obtain stable income growth in the face of a changing financial market. The research in this thesis therefore has good practical value.
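
The shaped reward in item 3, a linear weighted combination of the daily logarithmic rate of return and the Sortino ratio, can be illustrated with a minimal Python sketch. The abstract does not give the weights, the window over which the Sortino ratio is computed, or the target return, so `alpha`, `window`, `target` and all function names below are hypothetical placeholders rather than the thesis's actual settings.

```python
import math


def log_return(prev_value, curr_value):
    """Daily logarithmic rate of return of the portfolio value."""
    return math.log(curr_value / prev_value)


def sortino_ratio(returns, target=0.0):
    """Mean excess return divided by downside deviation.

    `target` is an assumed minimum acceptable return (0 here).
    """
    if not returns:
        return 0.0
    excess = [r - target for r in returns]
    mean_excess = sum(excess) / len(excess)
    # Downside deviation only penalizes returns below the target.
    downside = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    if downside == 0.0:
        return 0.0  # assumption: with no downside observed, use a neutral risk term
    return mean_excess / downside


def shaped_reward(values, alpha=0.5, window=20):
    """alpha * latest log return + (1 - alpha) * Sortino ratio over a trailing window.

    `values` is the history of portfolio values; `alpha` and `window`
    are illustrative defaults, not values reported in the thesis.
    """
    if len(values) < 2:
        raise ValueError("need at least two portfolio values")
    returns = [log_return(a, b) for a, b in zip(values, values[1:])]
    recent = returns[-window:]
    return alpha * returns[-1] + (1.0 - alpha) * sortino_ratio(recent)
```

With `alpha=1.0` the reward reduces to the pure daily log return (the more profitable setting in item 2), and with `alpha=0.0` it reduces to the pure risk-adjusted term; intermediate values trade profitability against downside risk.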
Keywords/Search Tags:Reinforcement Learning, Deep Learning, Portfolio Management, Multi-Agent, Reward Shaping