| Portfolio selection plays a central role in modern financial research. Within modern portfolio theory, an important topic is the mean-variance optimization problem, in which the mean reflects the reward and the variance reflects the risk. In this paper, we use Markov decision processes, which have attracted considerable recent attention, to solve the mean-variance optimization problem with a time-delay term over an infinite time horizon. The main contents of this paper are as follows: (1) We define the object of study, the mean-variance optimization problem with a time-delay term over an infinite time horizon, and give explicit expressions for its components. (2) Using the theory of sensitivity-based optimization, we consider the infinite-horizon setting in which the Markov chain governing the transitions has reached its stationary distribution, and derive sufficient conditions under which the mean-variance objective values obtained under different strategies satisfy a given ordering. We also give necessary conditions that an optimal strategy must satisfy. (3) We define two types of strategy spaces: the mixed strategy space and the random strategy space. Based on the preceding analysis, we give a strategy search algorithm for finding the optimal strategy and analyze the local optimality, in each of the two strategy spaces, of the strategy to which the algorithm converges. (4) We give an example to verify the feasibility of the strategy search algorithm. |