With climate change and users' growing demand for building functions, the share of building energy consumption among the three major categories of social energy consumption is rising rapidly year by year, and countries generally treat the control of building energy consumption as a primary goal when formulating energy policy. Building energy consumption arises mainly from the continuous operation of building equipment, which not only raises users' energy costs but also, together with the accompanying increase in greenhouse gas emissions, degrades indoor comfort. Formulating effective control strategies that save energy while keeping users comfortable has therefore become a key research topic in the building field. This thesis studies building energy consumption control with a continuous action space and uses deep reinforcement learning to optimize the control strategy of building equipment. The work consists of the following three parts.

(1) Conventional reinforcement learning spends a long time exploring for valuable data in the absence of external reward, which makes early training slow and convergence unstable. To address this, a deep deterministic policy gradient algorithm based on a self-supervised network (SSN-DDPG) is proposed. The method extends the deep deterministic policy gradient (DDPG) algorithm with a self-supervised network designed to capture more of the essential features of the sample data in reinforcement learning problems. The inputs of the actor and critic networks in DDPG are first passed through a feature extraction layer and a forward model, and the algorithm's raw inputs are replaced by the resulting high-dimensional feature representations. A loss function is then defined for training the parameters of these layers, which are optimized in a self-supervised manner by minimizing the prediction error; because the training data are reused more often, the convergence and stability of the algorithm improve. The proposed SSN-DDPG algorithm and the original DDPG algorithm are both applied to the Mountain Car and Pendulum problems, and the comparison shows that SSN-DDPG converges faster than DDPG on both problems and is more stable.
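As an illustration of the self-supervised network described in part (1), the sketch below shows one plausible way to pair a feature extraction layer with a forward model and train them by minimizing a prediction error; the class names, layer sizes, and feature dimension are hypothetical assumptions, not the thesis's actual architecture.

```python
# Minimal sketch of the self-supervised feature / forward-model idea in part (1).
# All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Maps a raw state to a higher-dimensional feature vector phi(s)."""
    def __init__(self, state_dim, feature_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, feature_dim),
        )

    def forward(self, state):
        return self.net(state)

class ForwardModel(nn.Module):
    """Predicts next-state features phi(s') from (phi(s), a)."""
    def __init__(self, feature_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim + action_dim, 128), nn.ReLU(),
            nn.Linear(128, feature_dim),
        )

    def forward(self, phi_s, action):
        return self.net(torch.cat([phi_s, action], dim=-1))

def self_supervised_loss(extractor, forward_model, state, action, next_state):
    """Prediction-error loss minimized to train the feature layers."""
    phi_s = extractor(state)
    phi_s_next = extractor(next_state).detach()       # target features
    phi_s_next_pred = forward_model(phi_s, action)    # predicted features
    return nn.functional.mse_loss(phi_s_next_pred, phi_s_next)

# During DDPG training, the actor and critic would receive the extractor
# output phi(s) instead of the raw state, and the extractor and forward
# model would be updated on each minibatch by minimizing the loss above
# alongside the usual actor and critic losses.
```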
(2) Current building equipment control strategies are often single and fixed, and do not take more effective energy-saving measures according to the building's energy consumption and surrounding environment. The proposed SSN-DDPG algorithm is therefore applied to the building energy consumption control problem. By mining the state data of the building equipment and the characteristics of the environmental data around the building, the continuous setpoint actions of the equipment can be adjusted more flexibly. An artificial neural network is used to predict building energy consumption, providing the equipment controller with analysis that supports better control actions. The energy consumption control problem is modeled as a Markov decision process (MDP), and the SSN-DDPG algorithm is used to optimize the control objective and solve for the optimal policy, which adjusts the discharge air temperature setpoint of the building's air handling unit. Experimental results show that SSN-DDPG reduces building energy consumption significantly while satisfying users' comfort as much as possible.

(3) The control strategy studied in Chapter 4 tends to reduce building energy consumption at some cost to users' comfort, and it does not consider the actual cost of the energy consumed. Building energy consumption control based on the SSN-DDPG algorithm under real-time electricity prices is therefore proposed: whereas in Chapter 4 energy consumption keeps increasing when building comfort is pursued, incorporating real-time electricity prices provides a reference for improving users' comfort while reducing energy cost. Combining historical data with current environmental information, the electricity price is predicted by a long short-term memory (LSTM) recurrent neural network, which supplies additional state information so that the MDP can learn better control actions. The SSN-DDPG algorithm is then used to train the agent to learn the optimal control strategy, solving the building energy consumption control problem and setting the discharge air temperature of the building's air handling unit. Experiments show that this method keeps the energy cost low, better satisfies users' comfort, and achieves overall building energy savings.
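To make the electricity-price prediction step in part (3) concrete, the sketch below shows a minimal LSTM forecaster whose output is appended to the building's sensor readings to form the MDP state; the input layout, layer sizes, and helper names are illustrative assumptions rather than the thesis's actual design.

```python
# Minimal sketch of LSTM-based electricity price forecasting feeding the
# MDP state, as described in part (3). Sizes and names are illustrative.
import torch
import torch.nn as nn

class PriceForecaster(nn.Module):
    def __init__(self, input_dim=1, hidden_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, price_history):
        # price_history: (batch, seq_len, input_dim) of past prices
        out, _ = self.lstm(price_history)
        return self.head(out[:, -1, :])   # predicted next price

def build_state(sensor_readings, forecaster, price_history):
    """Concatenate current sensor readings with the predicted price.

    The resulting vector would serve as the MDP state, and the SSN-DDPG
    agent would output the discharge air temperature setpoint as its
    continuous action.
    """
    with torch.no_grad():
        predicted_price = forecaster(price_history)
    return torch.cat([sensor_readings, predicted_price], dim=-1)
```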