In recent years, there has been significant interest in proton exchange membrane fuel cells (PEMFC) for applications in new energy vehicles, unmanned aerial vehicles, rail transportation, and marine transportation, owing to their high energy conversion efficiency, low operating noise, fast response, and low pollution. Within the PEMFC system, the gas supply system plays a crucial role in supplying hydrogen and oxygen and in managing hydrogen recirculation. It directly influences key parameters such as the oxygen excess ratio, the membrane pressure difference, and the hydrogen recycle ratio, and these parameters in turn have a significant impact on the overall efficiency, lifetime, and economic cost of the PEMFC. Because the variables in the gas supply system are interdependent, controlling any single variable simultaneously disturbs the others, which degrades overall control performance. It is therefore of great significance to design controllers that realize cooperative control of the oxygen supply, hydrogen supply, and hydrogen recirculation subsystems, so as to improve fuel cell efficiency, prolong service life, and reduce economic cost. However, the nonlinearity and parameter coupling of the PEMFC gas supply system make it difficult to establish an accurate control model and to apply model-based control methods. With the advancement of intelligent control, model-free control methods, inspired by the decision-making processes of the human brain, have proven effective for large, complex nonlinear systems. In this paper, we propose two cooperative control approaches for the PEMFC gas supply system, based on fuzzy control theory and reinforcement learning theory, respectively. The specific contributions and innovations of this work are as follows:

1) A simulation model of the PEMFC gas supply system was developed and analyzed. To investigate the internal characteristics of the PEMFC system, a semi-empirical, semi-mechanistic modeling approach was adopted for the gas supply system. Analysis of the model using a controlled-variable method revealed strong nonlinearity and parameter coupling within the PEMFC gas supply system (common definitions of the controlled quantities are recalled after this list).

2) Coordinated control of the PEMFC gas supply system based on a fuzzy PID controller was implemented. Considering the nonlinearity of the gas supply system, a fuzzy PID coordinated control framework was constructed to accomplish the cooperative control task (a minimal gain-scheduling sketch is given after this list). Comparative simulations show that the fuzzy PID controller outperforms the conventional PID controller, with smaller error extremes and a shorter settling time.

3) Coordinated control of the PEMFC gas supply system based on the improved Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm was achieved. To address the degradation of fuzzy PID performance caused by improperly designed fuzzy rules, a reinforcement-learning-based cooperative control framework was constructed and the control problem was modeled as a Markov Decision Process (MDP) (an illustrative formulation is sketched after this list). Agents were trained with the DDPG and MADDPG algorithms to coordinate the gas supply system, and simulation experiments demonstrate that the reinforcement learning controllers can accomplish the coordinated control task, although there is still significant room for improvement in steady-state performance.
4) The Variable Reward Curriculum Learning (VRCL) approach was proposed to improve the training process and the control performance of the reinforcement learning controllers. To address the high failure rate of agent training and the poor steady-state performance of the resulting controllers, the sparse reward phenomenon caused by the reward function was analyzed, which motivated the introduction of the VRCL approach (an illustrative variable-reward curriculum is sketched after this list). VRCL was combined with the DDPG and MADDPG algorithms to train the agents. Simulation experiments demonstrate that agents trained with this approach exhibit better tracking performance; among them, the VRCL-MADDPG algorithm achieves the smallest error extremes and the shortest settling time.
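For reference, the three controlled quantities named above are commonly defined as follows in the PEMFC gas supply literature; the notation below reflects these common conventions rather than symbols taken from this work. The oxygen excess ratio is the ratio of oxygen supplied to the cathode to oxygen consumed by the reaction, the membrane pressure difference is the anode pressure minus the cathode pressure, and one common convention defines the hydrogen recycle ratio as recirculated hydrogen flow over freshly supplied hydrogen flow:

    \lambda_{O_2} = \frac{W_{O_2,\mathrm{in}}}{W_{O_2,\mathrm{react}}}, \qquad
    \Delta p_{\mathrm{mem}} = p_{\mathrm{an}} - p_{\mathrm{ca}}, \qquad
    r_{H_2} = \frac{W_{H_2,\mathrm{rec}}}{W_{H_2,\mathrm{supply}}}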
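As a companion to contribution 2), the following is a minimal sketch, in Python, of how fuzzy gain scheduling can be layered on a PID loop for one channel (for example, the oxygen excess ratio). The membership functions, rule table, and gain values are illustrative placeholders and are not the tuning used in this work.

    import numpy as np

    def tri(x, a, b, c):
        """Triangular membership function peaking at b; inputs are pre-clipped."""
        return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

    # Linguistic sets for the error e and its rate de: Negative, Zero, Positive.
    # The outer sets act as shoulders because inputs are clipped to [-1, 1].
    SETS = {"N": (-2.0, -1.0, 0.0), "Z": (-1.0, 0.0, 1.0), "P": (0.0, 1.0, 2.0)}

    # Illustrative rule table: increase Kp when the error is large,
    # back it off near the setpoint to limit overshoot.
    RULES = {("N", "N"): +0.8, ("N", "Z"): +0.6, ("N", "P"): +0.2,
             ("Z", "N"): +0.2, ("Z", "Z"): -0.2, ("Z", "P"): +0.2,
             ("P", "N"): +0.2, ("P", "Z"): +0.6, ("P", "P"): +0.8}

    def fuzzy_delta_kp(e, de):
        """Sugeno-style weighted average of the rule outputs."""
        num = den = 0.0
        for (le, lde), dkp in RULES.items():
            w = tri(e, *SETS[le]) * tri(de, *SETS[lde])
            num += w * dkp
            den += w
        return num / den if den > 0.0 else 0.0

    def fuzzy_pid_step(e, e_prev, integ, dt, kp0=1.0, ki=0.5, kd=0.05):
        """One control update: base PID terms plus a fuzzy correction to Kp."""
        de = (e - e_prev) / dt
        kp = kp0 + fuzzy_delta_kp(np.clip(e, -1.0, 1.0), np.clip(de, -1.0, 1.0))
        integ += e * dt
        u = kp * e + ki * integ + kd * de
        return u, integ

Analogous rule tables can adjust Ki and Kd in the same way; the fact that such tables must be hand-crafted is precisely the limitation that motivates the reinforcement learning controllers in contribution 3).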
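For contribution 3), the sketch below illustrates one way the cooperative control task can be cast as an MDP suitable for DDPG or MADDPG training. The toy first-order dynamics, observation layout, reference values, and reward weights are assumptions standing in for the actual PEMFC gas supply model.

    import numpy as np

    class GasSupplyEnv:
        """Toy MDP for the cooperative gas-supply control task.

        State: plant outputs, their references, and the tracking errors.
        Action: one normalized actuator command per controlled channel
                (e.g., compressor voltage, H2 inlet valve, recirculation pump),
                one component per agent in the MADDPG setting.
        Reward: dense penalty on the absolute tracking errors.

        The first-order dynamics below are a placeholder for the PEMFC
        gas supply model; only the MDP structure mirrors the setup above.
        """

        def __init__(self, dt=0.01, refs=(2.0, 0.0, 0.5)):
            self.dt = dt
            self.refs = np.asarray(refs, dtype=float)  # lambda_O2, delta_p, recycle ratio
            self.reset()

        def reset(self):
            self.y = np.zeros(3)
            return self._obs()

        def _obs(self):
            return np.concatenate([self.y, self.refs, self.refs - self.y])

        def step(self, action):
            action = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
            self.y += self.dt * (2.0 * action - 0.5 * self.y)  # placeholder first-order lag
            err = self.refs - self.y
            reward = -float(np.sum(np.abs(err)))                # dense tracking penalty
            done = bool(np.any(np.abs(err) > 10.0))             # illustrative safety cutoff
            return self._obs(), reward, done, {}

Under MADDPG, each actuator channel would be driven by its own actor observing this state while centralized critics condition on the joint observation and action during training; DDPG corresponds to a single actor producing all three commands.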
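For contribution 4), the sketch below shows one plausible reading of a variable-reward curriculum: the tolerance band used in a bonus term of the tracking reward is tightened stage by stage, so that early training is not dominated by a sparsely triggered reward. The stage boundaries, tolerances, and reward shape are illustrative assumptions, not the exact design proposed in this work.

    import numpy as np

    CURRICULUM = [      # (episodes in stage, error tolerance band) -- illustrative
        (200, 0.50),
        (200, 0.20),
        (400, 0.05),
    ]

    def stage_tolerance(episode):
        """Tolerance band for the current training episode."""
        start = 0
        for n_eps, tol in CURRICULUM:
            if episode < start + n_eps:
                return tol
            start += n_eps
        return CURRICULUM[-1][1]

    def curriculum_reward(err, episode):
        """Dense penalty plus a bonus that is easy to earn early and strict later."""
        tol = stage_tolerance(episode)
        dense = -float(np.sum(np.abs(err)))       # always-available learning signal
        bonus = 1.0 if np.all(np.abs(err) < tol) else 0.0
        return dense + bonus

The staged schedule could equally be applied to other reward terms, which is the sense in which the reward, rather than the task itself, varies across the curriculum.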