| In recent years, with the continued promotion of electricity market reform, the generation side and the sale side have gradually been liberalized in an orderly manner to optimize resource allocation and promote healthy competition in the electricity market. However, generation companies (GENCOs), as market players in this new situation, face multiple uncertainties such as load demand and network congestion. Because market information is incomplete, it is usually hard for GENCOs to devise optimal strategies that maximize their own profits. Therefore, establishing a generation-side bidding decision model and studying GENCOs’ optimal bidding strategies have become an important issue in the electricity market field. In addition, with the deepening of sale-side market reform and the gradual expansion of the electricity selling business, GENCOs can enter the sale-side market and engage in electricity selling. The opening of the sale side has strengthened the business connection between GENCOs and electricity retailers. In this context, exploring the interactive decision-making behavior between the generation side and the sale side is of great significance for improving social welfare and enhancing electricity market operation. Therefore, studying GENCOs’ bidding strategies while accounting for the bidding behavior of retailers has become another key issue that urgently needs to be addressed. With the expansion of power system scale and the increasing penetration of new energy, the problems of high dimensionality, complexity, and nonlinearity in the electricity market are increasingly prominent. In this context, traditional bidding strategy methods, such as cost analysis, market price forecasting, game theory, and estimation of competitors’ bidding strategies, show certain shortcomings. Reinforcement learning algorithms can effectively 
solve the problems of multi-player games, incomplete information, and market uncertainty encountered by traditional optimization methods. Therefore, they are widely used for bidding optimization decisions in the electricity market. However, the following problems remain in current research on reinforcement-learning-based bidding strategies for GENCOs: 1) the market clearing models used are relatively simple; 2) the reinforcement learning algorithms used are prone to convergence difficulties caused by the unstable environment; 3) there is a lack of effective research on the interactive decision-making behavior of GENCOs and retailers. To solve these problems, this paper improves on traditional optimization strategies in three respects to raise the performance of the bidding optimization decision model: (1) First, in the design of the electricity market clearing model, this paper establishes a spot market clearing model containing a day-ahead market and a real-time market, based on the security-constrained unit commitment (SCUC) and security-constrained economic dispatch (SCED) algorithms. The spot market clearing model then serves as the external environment of the multi-agent model, and the dynamic bidding decision process of the GENCOs is embedded in it to better record how the bidding decisions evolve. (2) Second, in the selection and improvement of reinforcement learning algorithms, this paper explores various algorithms and the multi-agent environment, and proposes a multi-agent deep deterministic policy gradient (MADDPG) algorithm to optimize the bidding strategies of GENCOs. This algorithm allows each agent to consider the action choices of the other agents when updating its own parameters, which improves the training of each agent in a multi-agent environment and effectively solves the convergence difficulty caused by an unstable 
environment in traditional reinforcement learning algorithms. (3) Finally, for the bidding strategy of GENCOs considering the strategic bidding of retailers, this paper establishes a generation-sale side bidding model based on slightly altruistic utility and builds a spot market clearing model for bilateral bidding. The interactive decision-making problem between GENCOs and retailers is then modeled as a Markov game, and an improved MADDPG algorithm is proposed to simulate the dynamic evolution of bidding behavior between GENCOs and retailers. Through repeated iterative training, the optimal bidding strategies and the market equilibrium are finally found. |
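
To make the role of the clearing model as the agents' environment more concrete, the following is a minimal sketch of how GENCO bids could map to a clearing price and per-agent profits. It is not the SCUC/SCED-based model described above: it assumes a single period, a uniform-price merit-order auction, no network or unit commitment constraints, and inelastic demand. The names `Offer`, `clear_market`, and `profits`, the markup-style action, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    name: str        # hypothetical GENCO label
    capacity: float  # MW offered
    cost: float      # true marginal cost, $/MWh
    markup: float    # strategic markup chosen by the agent (its action)

    @property
    def price(self) -> float:
        # Bid price submitted to the market: cost scaled by the markup.
        return self.cost * (1.0 + self.markup)


def clear_market(offers, demand):
    """Single-period uniform-price merit-order clearing (assumed, simplified)."""
    dispatch = {o.name: 0.0 for o in offers}
    remaining = demand
    clearing_price = 0.0
    # Accept the cheapest bids first until demand is met.
    for o in sorted(offers, key=lambda o: o.price):
        if remaining <= 0:
            break
        q = min(o.capacity, remaining)
        dispatch[o.name] = q
        remaining -= q
        clearing_price = o.price  # price of the last accepted (marginal) bid
    if remaining > 1e-9:
        raise ValueError("demand exceeds total offered capacity")
    return clearing_price, dispatch


def profits(offers, clearing_price, dispatch):
    # Each GENCO is paid the clearing price and incurs its true cost.
    return {o.name: (clearing_price - o.cost) * dispatch[o.name] for o in offers}


offers = [
    Offer("G1", 100, 20, 0.10),
    Offer("G2", 80, 30, 0.0),
    Offer("G3", 50, 25, 0.5),
]
price, dispatch = clear_market(offers, demand=150.0)
reward = profits(offers, price, dispatch)
```

In a multi-agent setup of the kind described above, each agent's markup would be its action and its profit its reward. Because one agent's profit depends on every other agent's bid, the environment looks unstable from the viewpoint of any single learner.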