Metro systems have become a significant part of public transportation because of their advantages in speed, capacity, safety, and reliability. However, the energy consumption of metro systems is growing rapidly as their scale increases and operation headways decrease, making metro systems large consumers of industrial electricity. According to statistical data, vehicle traction accounts for about 35% to 45% of total energy consumption. It is therefore of great significance to optimize the driving strategies of the vehicles, so as to cut the energy cost of the metro system and build an environmentally friendly public transportation system.

In this paper, two core research ideas are proposed for determining the energy-efficient driving strategy according to the characteristics of the energy-efficient train control (EETC) problem in metro systems. First, a common approach to the EETC problem is to determine a driving strategy that minimizes the traction energy consumption within the scheduled trip time. However, the trip time is then introduced as an intermediate variable for determining the minimum energy consumption, which complicates the solution procedure. To address this, the inverse problem of EETC is proposed on the basis of the unique correspondence between optimal traction energy and scheduled trip time, i.e., minimizing the trip time of the journey under a fixed energy consumption. The energy-efficient driving strategy can be determined directly by solving the inverse problem, which simplifies the solution process. Second, a data-driven method is applied to determine the energy-efficient driving strategy, which makes full use of operational data and avoids a complicated modelling process. Furthermore, the cyclic timetables of metro systems generate a large amount of repetitive operational data, which is well suited to reinforcement learning (RL) approaches that learn from driving experience. Based on the analysis above, an EETC method
based on an RL approach is proposed in this paper. Specifically, the research work of this paper can be summarized as follows:

(1) The inverse problem of EETC is formulated. On this basis, an energy-distribution based method for determining the energy-efficient driving strategy is introduced; the method fits the definition of a finite Markov decision process and can therefore be solved with RL approaches. According to the characteristics of the EETC problem, the three key elements of the RL framework, i.e., state, action, and reward, are defined.

(2) An EETC method based on the Q-Learning approach is proposed. To store and update the value function in tabular form, two state representations, based on the trip time and on the energy-distribution state respectively, are proposed. Furthermore, the effectiveness of the proposed method is verified and a parametric sensitivity analysis is given through numerical experiments.

(3) To determine the energy-efficient driving strategy in a large state space, an EETC method based on the deep Q-network (DQN) is proposed. The method optimizes the driving strategy over multiple intervals, adjusting the trip time of each interval while improving the energy-saving performance. The impact of the hyper-parameters and the network structure on the performance of the method is also discussed.

(4) To improve the decision efficiency in a large action space and reduce the training time of the DQN-based approach, an EETC method based on the soft actor-critic (SAC) approach is proposed. Compared with the Q-Learning-based and DQN-based approaches, the proposed method reduces the number of training steps by 95.97% and 75.03%, respectively, demonstrating its advantage in sample complexity. Moreover, the convergence and stability of the proposed method are analyzed through numerical experiments.

There are 39 figures, 25 tables, and 98 references in this paper.
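To make the energy-distribution formulation concrete, the sketch below applies tabular Q-learning to a toy instance of the inverse problem: a fixed traction-energy budget is distributed one unit at a time over the intervals of a journey, and the agent learns the allocation that minimizes total trip time. The three-interval journey, the running-time model `time_i = base_i / (1 + e_i)`, and all hyper-parameters are invented for illustration and are not taken from the thesis.

```python
import random

# Toy instance (hypothetical numbers): 3 intervals, a budget of 6 energy
# units, and a made-up running-time model time_i = base_i / (1 + e_i).
BASES = (10.0, 20.0, 30.0)   # nominal running time of each interval
BUDGET = 6                   # total energy units to distribute

def trip_time(alloc):
    """Total trip time for an energy allocation (toy model)."""
    return sum(b / (1 + e) for b, e in zip(BASES, alloc))

def step(state, action):
    """Allocate one energy unit to interval `action`.

    The reward is the running-time reduction, so the episode return equals
    the total time saved; maximizing it minimizes trip time, mirroring the
    inverse problem (minimum trip time under a fixed energy consumption).
    """
    alloc = list(state)
    before = BASES[action] / (1 + alloc[action])
    alloc[action] += 1
    after = BASES[action] / (1 + alloc[action])
    next_state = tuple(alloc)
    done = sum(alloc) == BUDGET          # terminal: budget exhausted
    return next_state, before - after, done

def train(episodes=5000, alpha=0.5, gamma=1.0, eps=0.3, seed=0):
    """Tabular Q-learning over the (small) energy-distribution state space."""
    rng = random.Random(seed)
    q = {}                               # (state, action) -> value
    actions = list(range(len(BASES)))
    for _ in range(episodes):
        state, done = (0,) * len(BASES), False
        while not done:
            # epsilon-greedy behavior policy
            if rng.random() < eps:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda x: q.get((state, x), 0.0))
            nxt, r, done = step(state, a)
            target = r if done else r + gamma * max(
                q.get((nxt, x), 0.0) for x in actions)
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (target - old)
            state = nxt
    return q

def greedy_allocation(q):
    """Roll out the greedy policy to obtain the learned energy distribution."""
    state, done = (0,) * len(BASES), False
    actions = range(len(BASES))
    while not done:
        a = max(actions, key=lambda x: q.get((state, x), 0.0))
        state, _, done = step(state, a)
    return state

if __name__ == "__main__":
    q = train()
    alloc = greedy_allocation(q)
    print(alloc, round(trip_time(alloc), 3))
```

The state here is the energy allocated so far to each interval (the "energy-distribution state"), the action is the interval receiving the next unit, and the reward is the marginal trip-time saving; the DQN- and SAC-based methods in the thesis replace the table `q` with neural function approximators for larger state and action spaces.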