| Railway vertical alignment design is one of the key parts of railway route design, and researchers have long worked toward making it intelligent. However, traditional optimization-based methods are strong mainly in local optimization: their results sometimes depend on the quality of the initial plan, and it is difficult for them to approach or reach the global optimum. This thesis combines railway vertical alignment design with reinforcement learning and studies the construction of an intelligent design model for railway vertical alignment based on deep reinforcement learning. The detailed contents and achievements are as follows: (1) A decision-making model for railway vertical alignment based on reinforcement learning is established. By analyzing existing methods, a suitable mechanism for generating alignments through a sequential decision-making process and a policy-gradient-based deep reinforcement learning algorithm are studied to form the framework of the intelligent design model. Furthermore, drawing on the “end-to-end” characteristics of deep reinforcement learning, the model defines actions based on step length and slope, states based on terrain and alignment, and reward functions based on the optimization goals. (2) On the basis of this decision-making model, and taking into account the multi-constraint and multi-objective nature of vertical alignment design, a constrained multi-objective reinforcement learning decision-making model for railway vertical alignment is constructed. The multi-objective reinforcement learning algorithm is improved by incorporating the objective weights into the state of the model, which yields a single-network, multi-strategy deep reinforcement learning algorithm for multiple objectives. In addition, an action-mask mechanism is introduced to handle constraints: the constraints of vertical alignment design are used to calculate the safe action space, and actions that would lead to constraint violations at later steps are masked out before the agent makes a decision. (3) The influence of four hyper-parameters of the deep reinforcement learning algorithm on model performance is analyzed, namely network structure, learning rate, advantage function, and cross-entropy weight. Through experiments on a simulated case, the training processes, the training results, and the optimal solutions generated by the agent are analyzed, which provides guidance for subsequent experiments. (4) Case experiments based on an engineering case are carried out along two dimensions, single-objective and multi-objective. The model built in this thesis is compared with two existing optimization-based vertical alignment design models. The experimental results demonstrate the correctness and practicality of the model and the improved algorithm. |
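
The action-mask mechanism in item (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the discrete slope set in per mille, the fixed step length, and the three constraints used here (maximum gradient, maximum change in gradient between adjacent slopes, and reachability of a fixed end elevation) are all assumptions chosen for illustration. The look-ahead check is also simplified, since it ignores the gradient-change limit when testing whether the end elevation can still be reached.

```python
import numpy as np


def safe_action_mask(slopes, prev_slope, elevation, remaining_length,
                     step_length, end_elevation, max_slope, max_slope_change):
    """Return a boolean mask over candidate slopes (per mille) for the next step.

    All parameters are hypothetical names used for this sketch only.
    Lengths and elevations are in metres.
    """
    slopes = np.asarray(slopes, dtype=float)

    # Immediate constraints: gradient limit and limit on the change in gradient.
    ok_grade = np.abs(slopes) <= max_slope
    ok_change = np.abs(slopes - prev_slope) <= max_slope_change

    # Look-ahead constraint: after taking this step, the end elevation must
    # still be reachable over the remaining length at the maximum gradient.
    next_elevation = elevation + slopes * step_length / 1000.0
    length_left = remaining_length - step_length
    reachable_range = max_slope * length_left / 1000.0
    ok_future = np.abs(end_elevation - next_elevation) <= reachable_range + 1e-9

    return ok_grade & ok_change & ok_future


def masked_policy(logits, mask):
    """Turn policy logits into probabilities with unsafe actions set to zero."""
    logits = np.asarray(logits, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("no safe action available in this state")
    masked = np.where(mask, logits, -np.inf)  # unsafe actions get -inf logits
    masked = masked - masked.max()            # stabilise the softmax
    probs = np.exp(masked)                    # exp(-inf) evaluates to 0.0
    return probs / probs.sum()


if __name__ == "__main__":
    candidate_slopes = [-12, -6, 0, 6, 12]
    mask = safe_action_mask(candidate_slopes, prev_slope=0.0, elevation=100.0,
                            remaining_length=1000.0, step_length=500.0,
                            end_elevation=106.0, max_slope=12.0,
                            max_slope_change=12.0)
    print(mask)
    print(masked_policy(np.zeros(len(candidate_slopes)), mask))
```

Applying the mask to the logits before the softmax, rather than penalising violations through the reward, means the agent can never sample an action outside the computed safe set, which is the behaviour the abstract describes.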