| With the development of society and technology, the potential analysis and optimization of discrete event dynamic systems (DEDS) has become an active research topic at the intersection of control and systems science, management, and computer science. The semi-Markov decision process (SMDP) can be used to analyze most systems encountered in practice. Motivated by the needs of applications, the optimization of SMDPs has become one of the research focuses in the control field. Performance potential theory provides a unified framework for the optimization of SMDPs. This paper is concerned with asynchronous optimization problems for semi-Markov decision processes (SMDPs) with a compact action set, based on the performance potential; all the algorithms are unified for both the discounted and the average performance criteria. First, the unified standard value iteration (VI) algorithm based directly on the equivalent infinitesimal generator A_a~v is considered, and its convergence is established. Second, unified asynchronous VI algorithms are presented, including the Gauss-Seidel iteration algorithm and an asynchronous VI algorithm based on the simulation of a sample path. Then, according to performance potential theory, the corresponding modified VI is discussed. The above results are also applicable to continuous-time Markov decision processes. Traditional theoretical algorithms compute quickly and yield precise results, but usually cannot be used to optimize large-scale systems or systems about which little information is available. Simulation-based optimization algorithms, such as temporal difference (TD) learning and neuro-dynamic programming (NDP), can overcome these problems. Based on these characteristics, the paper introduces unified asynchronous policy iteration (PI) algorithms, such as multistage lookahead policy iteration and multistage lookahead PI based on TD learning and NDP.
These algorithms are unified for both the discounted and the average performance criteria. Finally, a numerical example is used to illustrate the different properties of the algorithms; the results obtained are also applicable to continuous-time Markov decision processes (MDPs). Based on the asynchronous algorithms, the paper introduces an optimization simulation platform that accepts parameters suited to the system under study and facilitates the performance optimization of most systems. |
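
To make the difference between standard (synchronous) VI and the Gauss-Seidel variant concrete, the following is a minimal sketch only. It assumes a small discrete-time, finite-state, finite-action MDP under the discounted criterion, not the SMDP setting with a compact action set and an equivalent infinitesimal generator treated in the paper. The transition array `P`, reward array `r`, discount factor `beta`, and both function names are hypothetical and chosen purely for illustration.

```python
import numpy as np


def standard_vi(P, r, beta, tol=1e-10, max_iter=10_000):
    """Synchronous VI: each sweep uses only the previous sweep's values."""
    n_states = P.shape[1]
    v = np.zeros(n_states)
    for sweep in range(1, max_iter + 1):
        # q[a, s] = r[a, s] + beta * sum_j P[a, s, j] * v[j]
        q = r + beta * (P @ v)
        v_new = q.max(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new, sweep
        v = v_new
    return v, max_iter


def gauss_seidel_vi(P, r, beta, tol=1e-10, max_iter=10_000):
    """Gauss-Seidel VI: states are updated in place, one at a time, so later
    states in a sweep already see the new values of earlier states."""
    n_states = P.shape[1]
    v = np.zeros(n_states)
    for sweep in range(1, max_iter + 1):
        delta = 0.0
        for s in range(n_states):
            new_value = np.max(r[:, s] + beta * (P[:, s, :] @ v))
            delta = max(delta, abs(new_value - v[s]))
            v[s] = new_value  # in-place update is the only difference
        if delta < tol:
            return v, sweep
    return v, max_iter


# Hypothetical example data: 2 actions, 3 states.
# P[a, s, j] is the probability of moving from state s to j under action a.
P = np.array([
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]],
    [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]],
])
# r[a, s] is the one-step reward for taking action a in state s.
r = np.array([[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]])
beta = 0.9  # discount factor

v_std, sweeps_std = standard_vi(P, r, beta)
v_gs, sweeps_gs = gauss_seidel_vi(P, r, beta)
```

Both routines approximate the same fixed point of the discounted optimality equation; they differ only in whether a sweep reads values from the previous sweep or from the partially updated vector. The sample-path-based asynchronous VI mentioned in the abstract would instead update only the states visited along a simulated trajectory, which this sketch does not attempt.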