Portfolio Model Construction And Risk Measurement Based On Adaptive Multi-Armed Bandit Algorithm

Posted on: 2021-01-17  Degree: Master  Type: Thesis
Country: China  Candidate: Y Fang  Full Text: PDF
GTID: 2370330623958806  Subject: Finance
Abstract/Summary:
Portfolio construction and selection is both a fundamental research problem in quantitative finance and a practical task in financial engineering. Its purpose is to optimize the allocation of wealth across assets. Markowitz first proposed a formal model of portfolio selection in 1952. Based on the relationship between asset return and risk, the model uses mean-variance analysis: with risk measured by variance, the investor should choose the allocation that maximizes expected return at a given level of risk, and the selection of the optimal portfolio is discussed on that basis. The model earned Markowitz the Nobel Prize in Economics in 1990 and still attracts the attention of many investment companies. With the development of modern mathematical methods and the advent of financial mathematics, which studies financial and economic problems with mathematical methods, modern investment theory has moved beyond purely empirical practice and simple descriptive research into a stage of quantitative analysis that guides investors in making investment decisions.

With the rapid development of the world economy, financial crises and market fluctuations now occur frequently. Although China's capital market has made great progress since the reform and opening-up, it is still not mature. As a result, investors face increasingly complicated theoretical and practical problems in investment decision-making and portfolio optimization, and research on these problems is becoming more important in both theory and practice.

In this thesis, the portfolio selection problem is modeled as a sequential decision-making problem under uncertainty, and reinforcement learning is applied to it. The LinUCB (Linear Upper Confidence Bound) algorithm from reinforcement learning is used to construct an upper confidence bound for the risk in portfolio construction. The model maximizes the cumulative return over the experimental period by constructing the ratio of return to risk in each selection period. In addition, a decision tree algorithm is introduced to classify the stock pool, in a supervised manner, before each selection period and to form the candidate portfolio. The research proceeds in three parts: first, constructing a C4.5 decision tree model with improved parameters; second, constructing the adaptive contextual multi-armed bandit model; and third, adjusting the parameters of the portfolio optimization model and comparing the results.

The results show that the adaptive contextual multi-armed bandit portfolio model, built on the principle of utility maximization, achieves a higher cumulative return over the experimental period than the control group, which indicates the effectiveness of the method. Further experiments show that the model keeps learning during portfolio selection, so that the algorithm understands investors better and better. Therefore, the closer the training data is to the end of the experimental period, the better the portfolio performs.
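As a rough illustration of the LinUCB idea mentioned in the abstract, the sketch below shows the standard disjoint LinUCB rule in plain Python: each arm keeps a ridge-regression estimate of reward given a context vector and is scored by its predicted reward plus a confidence bonus. This is the textbook algorithm, not the thesis's adaptive, risk-adjusted variant. The names `LinUCBArm` and `select_arm`, the parameter `alpha`, and the reading of "arm" as a candidate portfolio and "reward" as a per-period return are assumptions made for illustration only.

```python
import math
import random


class LinUCBArm:
    """One arm of disjoint LinUCB (hypothetical helper, not from the thesis)."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha  # exploration weight (assumed value)
        # Inverse of A = I + sum(x x^T); starts as the identity (ridge prior).
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * context

    @staticmethod
    def _matvec(M, x):
        return [sum(m * v for m, v in zip(row, x)) for row in M]

    def ucb(self, x):
        """Predicted reward plus alpha * sqrt(x^T A^-1 x)."""
        theta = self._matvec(self.A_inv, self.b)
        mean = sum(t * v for t, v in zip(theta, x))
        Ax = self._matvec(self.A_inv, x)
        width = math.sqrt(max(sum(v * a for v, a in zip(x, Ax)), 0.0))
        return mean + self.alpha * width

    def update(self, x, reward):
        """Rank-one update of A^-1 via the Sherman-Morrison formula."""
        Ax = self._matvec(self.A_inv, x)
        denom = 1.0 + sum(v * a for v, a in zip(x, Ax))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        self.b = [bi + reward * v for bi, v in zip(self.b, x)]


def select_arm(arms, contexts):
    """Return the index of the arm with the highest upper confidence bound."""
    scores = [arm.ucb(x) for arm, x in zip(arms, contexts)]
    return max(range(len(arms)), key=scores.__getitem__)


if __name__ == "__main__":
    # Synthetic demo: two arms with made-up linear reward models plus noise.
    rng = random.Random(0)
    true_theta = [[0.8, 0.1], [0.2, 0.3]]
    arms = [LinUCBArm(dim=2, alpha=0.5) for _ in true_theta]
    pulls = [0, 0]
    for _ in range(200):
        x = [rng.random(), rng.random()]
        k = select_arm(arms, [x, x])
        reward = sum(t * v for t, v in zip(true_theta[k], x)) + rng.gauss(0, 0.05)
        arms[k].update(x, reward)
        pulls[k] += 1
    print("pulls per arm:", pulls)
```

In a portfolio setting, the context vector would hold whatever market features are available at each selection period, and the score could be replaced by a return-to-risk ratio as the abstract describes; how the thesis does this in detail is not stated here.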
Keywords/Search Tags: portfolio selection, contextual multi-armed bandit algorithm, sequential decision-making, decision tree algorithm, utility measure