| With the steady growth of the social economy, China’s electricity demand has risen, and problems such as widening peak-to-valley load gaps have become increasingly prominent. At the same time, with the rapid development of smart grids, user-side resources have become more proactive, which creates favorable conditions for using them to ease the pressure on power supply. As China deepens its power system reform, ensuring the sustained and sound development of demand response has become a key part of the reform. Research on implementing demand response in a market environment is therefore significant for shifting peak loads, balancing power supply and demand, and improving the overall economic benefits of society. In the market environment, market entities such as power grid companies, electricity retailers, and users have also put forward new requirements for application scenarios of incentive-based demand response according to their own operational goals. China has carried out demand response work for many years, and it has played a major role in dealing with power shortages. However, existing research still gives little consideration to the new demand response application requirements of each entity in the market environment. Therefore, this paper studies the optimal strategies of power grid companies, electricity retailers, and users participating in incentive-based demand response, so as to promote the development of incentive-based demand response in China during the electricity market reform. First, a master-slave (Stackelberg) game model of demand response between the grid company and multiple users is constructed, and an optimization method for their respective demand response strategies is proposed. In this model, the power grid company can select appropriate time periods to implement demand response projects based on the annual load duration curve obtained one year in advance and the historical response
behavior of users, so as to reduce transmission and distribution construction costs and improve overall efficiency. Users can adjust their response amount in each time period according to the demand response subsidy price set by the grid company for that period to maximize their own demand response profit. By analyzing the existence of an equilibrium in the constructed master-slave game model and the approach to solving it, a solution method for the game equilibrium is proposed. The case study results show that the constructed game model achieves the set goals, and both the grid company and users benefit from participating in demand response projects. In addition, the impact of the avoidable unit capacity cost of transmission and distribution on the grid company’s demand response profit is analyzed. Then, an incentive-based demand response master-slave game model between a single electricity retailer and multiple users is constructed. The electricity retailer formulates a demand response subsidy strategy for periods when the spot market electricity price is higher than the retail price it has set, in order to reduce its losses on electricity sales. Each user determines its response amount in the corresponding period according to the subsidy price set by the electricity retailer to obtain additional profit. Through analysis, a solution method for the game model is obtained. The case study results show that both the electricity retailer and users can benefit from demand response. In addition, the impact of spot market price fluctuations on subsidy price setting, user response volume, and the respective demand response profits is analyzed, as are the changes in the electricity retailer’s demand response profit when different types of users join demand response projects. Finally, a trading mechanism for power consumption rights among users is designed for the market environment, and the user power consumption right trading model is
constructed. In this model, users can freely choose to participate in power consumption right trading as either buyer or seller to maximize their own profits. The power consumption right trading is simulated using the WoLF-PHC reinforcement learning algorithm. The case study results show that when only a single user performs reinforcement learning, that user’s profit quickly converges to the optimal solution. When all users perform reinforcement learning, the power consumption right trading profit of each user remains stable, the market eventually reaches equilibrium, and the social surplus formed by the market exchange is close to the maximum. |
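
To illustrate the learning rule named above, the following is a minimal, stateless sketch of a WoLF-PHC (Win or Learn Fast, Policy Hill-Climbing) update for a single learner. It is not the trading model of this paper. The three fixed rewards, the learning rates, the exploration rate, and the names `wolf_phc_step` and `simulate` are all assumptions chosen for illustration, with each action standing in for a hypothetical bid choice.

```python
import random


def wolf_phc_step(Q, pi, pi_avg, count, action, reward,
                  alpha=0.1, delta_win=0.01, delta_lose=0.04):
    """One stateless WoLF-PHC update (illustrative parameter values).

    Mutates Q, pi and pi_avg in place and returns the updated visit count.
    """
    n = len(Q)
    # Q-learning update; with a single state there is no next-state term.
    Q[action] += alpha * (reward - Q[action])
    # Incrementally update the average policy.
    count += 1
    for a in range(n):
        pi_avg[a] += (pi[a] - pi_avg[a]) / count
    # "Win or learn fast": take a small step when the current policy does
    # better than the average policy, and a larger step otherwise.
    v_cur = sum(p * q for p, q in zip(pi, Q))
    v_avg = sum(p * q for p, q in zip(pi_avg, Q))
    delta = delta_win if v_cur > v_avg else delta_lose
    # Hill-climb: shift probability mass from non-greedy actions to the
    # greedy one, never pushing any probability below zero.
    best = max(range(n), key=lambda a: Q[a])
    moved = 0.0
    for a in range(n):
        if a != best:
            step = min(pi[a], delta / (n - 1))
            pi[a] -= step
            moved += step
    pi[best] += moved
    return count


def simulate(rewards, steps=3000, epsilon=0.1, seed=0):
    """Run one learner against fixed, hypothetical per-action rewards."""
    rng = random.Random(seed)
    n = len(rewards)
    Q = [0.0] * n
    pi = [1.0 / n] * n
    pi_avg = [1.0 / n] * n
    count = 0
    for _ in range(steps):
        # Follow the mixed policy, with occasional uniform exploration.
        if rng.random() < epsilon:
            action = rng.randrange(n)
        else:
            action = rng.choices(range(n), weights=pi)[0]
        count = wolf_phc_step(Q, pi, pi_avg, count, action, rewards[action])
    return pi, Q


if __name__ == "__main__":
    policy, q_values = simulate([1.0, 3.0, 2.0])
    print(policy, q_values)
```

In a multi-user setting, each user would run its own copy of this update against rewards produced by market clearing, which is not modeled in this sketch.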