Research On The Control Method Of Building Indoor Environment Based On Inverse Reinforcement Learning

Posted on: 2022-01-24 | Degree: Master | Type: Thesis
Country: China | Candidate: S B Wu | Full Text: PDF
GTID: 2492306557957829 | Subject: Master of Engineering
Abstract/Summary:
With the emergence of Sick Building Syndrome, people have come to realize that a closed indoor environment adversely affects the health of occupants, and the comfort of the indoor environment of buildings has received great attention. In addition, as the consumption of fossil energy grows, society's demand for electricity continues to rise. Because building operation consumes a large amount of electricity, reducing energy consumption while ensuring comfort has become an important research direction in related fields. This thesis takes building indoor environment control as its scenario, uses reinforcement learning to solve the control optimization problem, and focuses on how the reward function in reinforcement learning should be set. Because the reward function is difficult to specify by hand in a complex indoor environment, the thesis uses inverse reinforcement learning to obtain the reward function and the optimal policy, and applies the algorithm to an actual air-conditioning system. To address the small-sample problem in the simulation process, the concept of meta-learning is introduced, the reward function is modeled and solved with a relative entropy probability model, and the proposed algorithm is applied to the automatic control of the air-conditioning system, thereby regulating the indoor environment. The main research contents are as follows:

(1) Traditional inverse reinforcement learning algorithms are slow, imprecise, or even unable to solve for the reward function when expert demonstration samples are insufficient and the state transition probabilities are unknown. To address this, a meta-inverse reinforcement learning method based on relative entropy is proposed. Using meta-learning, a prior for the target task is constructed from a set of meta-training tasks that follow the same distribution as the target task. In the model-free reinforcement learning setting, the relative entropy probability model is used to model the reward function and is combined with the prior, so that the reward function of the target task can be solved quickly from a small number of target-task samples. The proposed algorithm and the relative entropy inverse reinforcement learning (RE IRL) algorithm are applied to the classic Gridworld and Object World problems. Experiments show that the proposed algorithm still recovers the reward function well when the expert demonstrations of the target task are sparse and the state transition probabilities are unavailable.

(2) Because the reward function is difficult to specify by hand in the Markov decision process (MDP) of the indoor environment control task, an indoor environment control method based on apprenticeship learning is proposed. Expert samples are constructed by collecting the control sequences of human experts, and the reward function is solved with the apprenticeship learning method. As the reward function parameters are iteratively updated, a control policy close to that of the human expert is obtained. The proposed method is applied to a simulated indoor environment model for simulation experiments. The experimental results show that the method solves the reward function setting problem of the indoor environment control MDP in a data-driven manner and, in turn, controls the indoor environment adaptively.

(3) Inverse reinforcement learning is applied to the automatic control of real air-conditioning systems. To address the insufficient number of samples in the modeling process, an intelligent air-conditioning control method is built by combining artificial neural networks with the relative entropy-based meta-inverse reinforcement learning method of Chapter 3, and the changes in system performance under different reward functions are studied. The proposed method was verified on a large database, and the experimental results show that it can still achieve intelligent control of the air-conditioning system with a small number of samples. Moreover, by supplying control samples from different users, the control system can be tailored to each user's preferences, which gives the method practical significance.
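The reward-fitting step in item (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes a linear reward R(s) = w · φ(s) and uses the single-step weight update from the standard projection form of apprenticeship learning, where w is the difference between the expert's and the current policy's discounted feature expectations. The function names, the two-dimensional feature vector (temperature deviation, energy use), and the sample values are hypothetical.

```python
import numpy as np

def feature_expectations(trajectories, gamma=0.9):
    """Average discounted feature sum over a set of trajectories.

    Each trajectory is a sequence of per-step feature vectors phi(s_t).
    """
    total = None
    for traj in trajectories:
        feats = np.asarray(traj, dtype=float)        # shape (T, d)
        discounts = gamma ** np.arange(len(feats))   # shape (T,)
        disc_sum = discounts @ feats                 # shape (d,)
        total = disc_sum if total is None else total + disc_sum
    return total / len(trajectories)

def reward_weights(mu_expert, mu_policy):
    """One weight update: w = mu_expert - mu_policy (assumed linear reward).

    The norm of w measures how far the current policy's feature
    expectations are from the expert's; iteration stops when it is small.
    """
    w = mu_expert - mu_policy
    return w, float(np.linalg.norm(w))

# Hypothetical features per step: [temperature deviation, energy use].
expert_demos = [[[0.1, 0.3], [0.0, 0.2]], [[0.2, 0.3], [0.1, 0.2]]]
policy_rollouts = [[[0.8, 0.1], [0.7, 0.1]]]

mu_e = feature_expectations(expert_demos)
mu_pi = feature_expectations(policy_rollouts)
w, gap = reward_weights(mu_e, mu_pi)
print("reward weights:", w, "gap to expert:", gap)
```

In a full loop, a reinforcement learning step would compute a new policy under the reward w · φ(s), its rollouts would replace `policy_rollouts`, and the update would repeat until the gap falls below a chosen tolerance.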
Keywords/Search Tags:Reinforcement Learning, Inverse Reinforcement Learning, Meta-learning, Indoor Environment, Automatic Control of Air Conditioning