Deep Reinforcement Learning (DRL) is an automatic policy-learning paradigm that combines deep learning with a small amount of reward signal. Because deep learning demands large amounts of data and reward signals are sparse, DRL typically requires a large volume of training data to achieve its task objectives. Given the cost of interaction in the real world, it is important to study efficient DRL algorithms, i.e., methods that train a policy of comparable performance using as few interaction samples as possible. This research addresses the problem from two aspects: data augmentation and the training framework.

In the data augmentation aspect, the goal is to construct new samples from existing ones and expand the data set, improving data efficiency beyond the original data. However, previous methods originated in traditional image classification and do not account for the semantic consistency between image states and actions in decision-making scenarios. This research proposes Efficient DRL with Symmetric Consistency, which achieves correct augmentation by constructing semantically consistent symmetric transition samples. To fully exploit these symmetric properties, the Symmetric Deep Q Network (Sym.DQN) is proposed, which jointly optimizes on the original and augmented samples and thereby improves data efficiency.

In the training framework aspect, previous algorithms adopt a fixed update-to-interaction ratio, which causes under-training or over-fitting across different training stages and tasks and limits performance. To address this, the difficulty of fitting samples is measured by the local standard deviation of the loss, and a dynamic threshold is constructed from the exponential moving average of that standard deviation. Building on these components, Efficient DRL with Flexible Update is proposed, which adjusts the update-to-interaction ratio according to the complexity of the training samples, alleviating under-training and over-fitting and reducing computational cost while preserving performance.

Experiments are conducted in the Arcade Learning Environment following the Atari 100K video game benchmark. The results show that the Symmetric Deep Q Network outperforms previous models, and that the Flexible Update mechanism reduces training cost across tasks while improving overall data efficiency.
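To make the two mechanisms above concrete, the following is a minimal sketch, not the paper's implementation: all names, constants, the left/right action-mirror table, and the window and EMA settings are illustrative assumptions. The first helper builds a semantically consistent symmetric transition by flipping both image states and mirroring the action; the second tracks the local standard deviation of recent losses against an exponential-moving-average threshold to pick the number of updates per environment step.

```python
from collections import deque
import statistics

# Assumed action mirror for a left/right-symmetric task (illustrative):
# 0 = noop, 1 = move left, 2 = move right.
ACTION_MIRROR = {0: 0, 1: 2, 2: 1}

def symmetric_transition(state, action, reward, next_state):
    """Construct a semantically consistent symmetric sample by
    horizontally flipping both image states and mirroring the action.
    States are nested lists of pixel rows; reward is unchanged."""
    flip = lambda img: [row[::-1] for row in img]
    return flip(state), ACTION_MIRROR[action], reward, flip(next_state)

class FlexibleUpdateScheduler:
    """Sketch of a flexible update-to-interaction ratio: the local
    standard deviation of recent losses measures sample difficulty,
    and its exponential moving average serves as a dynamic threshold."""

    def __init__(self, window=32, ema_beta=0.99, base_updates=1, max_updates=4):
        self.losses = deque(maxlen=window)  # recent per-batch losses
        self.ema_std = None                 # EMA of loss std = dynamic threshold
        self.ema_beta = ema_beta
        self.base_updates = base_updates
        self.max_updates = max_updates

    def record(self, loss):
        self.losses.append(float(loss))

    def updates_per_step(self):
        if len(self.losses) < 2:
            return self.base_updates
        std = statistics.pstdev(self.losses)  # local loss std = difficulty
        if self.ema_std is None:
            self.ema_std = std
        else:
            self.ema_std = self.ema_beta * self.ema_std + (1 - self.ema_beta) * std
        # Hard-to-fit batches (std above the dynamic threshold) receive
        # more gradient updates; easy ones fall back to the base ratio,
        # saving computation.
        return self.max_updates if std > self.ema_std else self.base_updates
```

In a training loop, each environment step would record the latest loss and then perform `updates_per_step()` gradient updates, so stable (easy) phases run at the cheap base ratio and difficult phases train harder.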