Optimal Time Scales for Reinforcement Learning Behaviour Strategies

Posted on:2011-07-13

Degree:M.Sc

Type:Thesis

University:McGill University (Canada)

Candidate:Comanici, Gheorghe


GTID:2442390002467225

Subject:Artificial Intelligence

Abstract/Summary:

Reinforcement Learning is a branch of Artificial Intelligence addressing the problem of single-agent autonomous sequential decision making. It proposes computational models that do not rely on complete knowledge of the dynamics of stochastic environments. Options are a formalism used to temporally extend actions towards hierarchically organized behaviour, a concept used to improve learning in large-scale problems. In this thesis we propose a new approach for generating options. Given controllers or behaviour policies as prior knowledge, we learn how to switch between these policies by optimizing the expected total discounted reward of the hierarchical behaviour. We derive gradient descent-based algorithms for learning optimal termination conditions of options, based on a new option termination representation. We provide theoretical guarantees and extensions of widely used Reinforcement Learning algorithms when options have variable time-scales. Finally, we incorporate the proposed approach into policy-gradient methods with linear function approximation.
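As a rough illustration of the setting the abstract describes, the sketch below runs two fixed behaviour policies on a toy five-state chain and switches between them according to per-state termination probabilities. Everything in it is an assumption made for illustration and is not taken from the thesis: the chain environment, the names `go_left`, `go_right`, `run_episode` and `theta`, the sigmoid form of the termination probability, and the rule of switching to the other policy on termination. It only evaluates the discounted return for a given termination setting; it does not implement the gradient-based learning of termination conditions that the thesis derives.

```python
import math
import random


def sigmoid(x):
    """Map a real-valued parameter to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Two fixed behaviour policies on a chain of states 0..4 (hypothetical prior knowledge).
def go_left(state):
    return -1


def go_right(state):
    return +1


POLICIES = [go_left, go_right]
N_STATES = 5
GOAL = 4  # reaching this state yields reward 1 and ends the episode


def run_episode(theta, gamma=0.9, max_steps=50, seed=0):
    """Follow one policy at a time. After arriving in state s, terminate the
    current option with probability sigmoid(theta[option][s]) and hand control
    to the other policy (an assumed switching rule).

    Returns (discounted return, number of switches)."""
    rng = random.Random(seed)
    state, option = 0, 0  # start at the left end, following go_left
    ret, discount, switches = 0.0, 1.0, 0
    for _ in range(max_steps):
        action = POLICIES[option](state)
        state = min(max(state + action, 0), N_STATES - 1)  # stay inside the chain
        reward = 1.0 if state == GOAL else 0.0
        ret += discount * reward
        discount *= gamma
        if state == GOAL:
            break
        if rng.random() < sigmoid(theta[option][state]):
            option = 1 - option
            switches += 1
    return ret, switches


# go_left terminates almost surely everywhere; go_right almost never terminates.
theta = [[50.0] * N_STATES, [-50.0] * N_STATES]
ret, switches = run_episode(theta)
print(ret, switches)
```

With this setting the agent wastes one step under `go_left`, switches once, and then follows `go_right` to the goal. If neither policy ever terminated, the agent would stay at the left end and collect no reward. That is the sense in which the choice of termination conditions, and hence the time scale of each option, affects the expected discounted reward.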

Keywords/Search Tags:

Behaviour, Options

Related items

1	Research On Customer-Oriented Wide-Body Civil Airplane Options
2	Investigation Of Salt Leaching Effects On Mechanical Behaviour Of Natural Marine Clays
3	Study On Individual Travel Choice Behavior In Response To Traffic Information
4	Valuation Of Hydrogen-based Energy Storage System In Wind Power Generation:A Real Options Research
5	Analysis On Investment Decision Of Fuqing Nuclear Power Plant On Real Options
6	Finding the positive in a hostile world: Relationships between aspects of social information processing, prosocial behaviour, and aggressive behaviour, in children with ADHD and disruptive behaviour
7	Research On The Value Evaluation Of BYD New Energy Automobile Companies Based On Real Options
8	Value Evaluation Of New Energy Automobile Enterprises Based On Real Options
9	Study On Investment Value Of PPP Highway Projects Considering Governments Bilateral Guarantee Options
10	Software options for support of three-dimensional CAD in the architectural design process: A critical evaluation by practitioners and educators