With the continued progress of industrialization and the large-scale consumption of fossil fuels, the energy crisis has become a major challenge facing countries around the world. To address this challenge, countries are actively optimizing their energy structures and advancing the green, low-carbon transformation of energy. In recent years, renewable energy, being clean, low-carbon, and pollution-free, has increasingly penetrated the distribution network and has become an important means of achieving energy conservation and emission reduction and of reaching the carbon peaking and carbon neutrality goals. However, distributed energy output is uncertain and intermittent, which can significantly affect the stability of new power systems. Energy management based on reinforcement learning is widely applied because it can improve the reliability, safety, and economy of the power system, but traditional reinforcement learning methods suffer from slow learning and sparse rewards when applied to the energy management of complex power systems. This thesis therefore improves on traditional reinforcement learning methods to enhance the effectiveness of energy management strategies. The main research contents are as follows.

First, an energy optimization management strategy based on the deep deterministic policy gradient (DDPG) algorithm is studied. An electric energy router system model is established with energy balance and minimum electricity cost as the optimization objectives, the optimization problem is formulated as a Markov decision process, and the state space, action space, and reward function are defined. Because sparse rewards leave most samples in the experience pool without a reward signal, sample utilization efficiency is low. To solve this problem, this thesis proposes a DDPG algorithm based on prioritized experience replay, which improves the experience replay mechanism to alleviate sparse rewards and raise sample utilization efficiency. Simulation comparisons with other algorithms verify that the proposed algorithm is more stable and more effective in solving sparse-reward problems.

The above research addresses sparse rewards mainly by improving the experience replay mechanism. However, in high-dimensional spaces and complex environments, reinforcement learning methods based on an improved experience replay mechanism still face the difficulty of reward function design, and an unreasonable reward function leads to slow learning and poor robustness. To address this issue, this thesis proposes a generative adversarial imitation learning method, which combines imitation learning with generative adversarial networks to avoid designing complex reward functions and to alleviate sparse rewards. To improve the agent's exploration ability and learning efficiency, a generative adversarial network structure based on expert policies is adopted, and the imitation learning optimization process is realized through the continual adversarial game between the discriminator network and the generator network. Simulation analysis and comparison show that the generative adversarial imitation learning method learns faster and is more robust, and verify the effectiveness of the method for energy management.
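As a concrete illustration of the Markov decision process formulation described above, the following is a minimal sketch of an electric-energy-router environment with energy balance and electricity cost in the reward. The state variables (time of day, photovoltaic output, load demand, tariff, battery state of charge), the single battery-dispatch action, and all numerical parameters are assumptions introduced for demonstration; the thesis's actual system model may define these quantities differently.

```python
# Illustrative sketch only: state/action/reward definitions and parameters are assumed,
# not taken from the thesis.
import numpy as np

class EnergyRouterEnv:
    """Toy electric-energy-router environment posed as a Markov decision process."""

    def __init__(self, horizon=24):
        self.horizon = horizon           # one decision per hour (assumed)
        self.soc_max = 10.0              # battery capacity in kWh (assumed)
        self.p_batt_max = 3.0            # max charge/discharge power in kW (assumed)
        self.reset()

    def reset(self):
        self.t = 0
        self.soc = 0.5 * self.soc_max
        return self._state()

    def _state(self):
        pv = max(0.0, 4.0 * np.sin(np.pi * self.t / self.horizon))      # assumed PV profile
        load = 2.0 + np.cos(2 * np.pi * self.t / self.horizon)          # assumed load profile
        price = 0.6 if 8 <= self.t <= 20 else 0.3                       # assumed tariff
        return np.array([self.t / self.horizon, pv, load, price, self.soc / self.soc_max])

    def step(self, action):
        """action in [-1, 1]: battery charge (+) / discharge (-) as a fraction of p_batt_max."""
        _, pv, load, price, _ = self._state()
        p_batt = float(np.clip(action, -1.0, 1.0)) * self.p_batt_max
        p_batt = float(np.clip(p_batt, -self.soc, self.soc_max - self.soc))  # respect SOC limits
        self.soc += p_batt
        # Grid import balances the router: load plus charging minus PV generation.
        p_grid = load + p_batt - pv
        cost = max(p_grid, 0.0) * price           # electricity purchase cost
        imbalance = abs(min(p_grid, 0.0))         # surplus that cannot be absorbed
        reward = -(cost + 0.1 * imbalance)        # minimize cost while keeping energy balance
        self.t += 1
        return self._state(), reward, self.t >= self.horizon
```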
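The prioritized experience replay mechanism referred to above can be sketched as follows. This reflects the standard proportional-priority scheme, in which priorities are derived from TD errors and non-uniform sampling is corrected with importance-sampling weights; the exponents `alpha` and `beta` and the buffer interface are illustrative assumptions rather than the thesis's exact design.

```python
# Minimal sketch of proportional prioritized experience replay (assumed hyperparameters).
import numpy as np

class PrioritizedReplayBuffer:
    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha                               # how strongly priorities skew sampling
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are replayed at least once.
        max_prio = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[:len(self.data)] ** self.alpha
        probs = prios / prios.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the bias introduced by non-uniform sampling.
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return [self.data[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        # Larger TD error (e.g. the rare rewarded transitions) means more frequent replay.
        self.priorities[idx] = np.abs(td_errors) + eps
```

Within a DDPG training loop, the critic's TD errors for each sampled batch would be fed back through `update_priorities`, so the comparatively rare rewarded transitions are replayed more often and sample utilization efficiency improves.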
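The adversarial game between the discriminator network and the generator network in generative adversarial imitation learning can be sketched as below, here using PyTorch. The network sizes, the surrogate-reward form, and the state/action dimensions (matching the toy environment above) are assumptions for illustration only and are not the thesis's actual architecture.

```python
# Sketch of the GAIL idea: the discriminator separates expert (state, action) pairs from
# policy-generated ones, and its output replaces a hand-designed reward. All sizes assumed.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 5, 1   # assumed, matching the toy environment sketch above

discriminator = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.Tanh(), nn.Linear(64, 1))
d_opt = torch.optim.Adam(discriminator.parameters(), lr=3e-4)
bce = nn.BCEWithLogitsLoss()

def discriminator_step(expert_sa, policy_sa):
    """Train D to output 1 for expert (state, action) pairs and 0 for policy-generated pairs."""
    logits_e = discriminator(expert_sa)
    logits_p = discriminator(policy_sa)
    loss = bce(logits_e, torch.ones_like(logits_e)) + bce(logits_p, torch.zeros_like(logits_p))
    d_opt.zero_grad(); loss.backward(); d_opt.step()

def imitation_reward(state, action):
    """Surrogate reward from D: high when the policy's behaviour resembles the expert's."""
    with torch.no_grad():
        d = torch.sigmoid(discriminator(torch.cat([state, action], dim=-1)))
    return -torch.log(1.0 - d + 1e-8)   # no hand-crafted reward function is needed
```

The generator side closes the loop: the policy is updated by an ordinary reinforcement learning algorithm (for example, a DDPG-style actor-critic) that maximizes `imitation_reward`, so the discriminator and generator are trained in alternation and the design of a complex reward function is avoided.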