
Computational modeling of learning in complex problem solving tasks

Posted on: 2009-09-15
Degree: Ph.D
Type: Thesis
University: McGill University (Canada)
Candidate: Frederic Dandurand
GTID: 2447390002993940
Subject: Psychology
Abstract/Summary:
The information processing theory of problem solving has emphasized search and heuristics while comparatively neglecting learning; this thesis addresses that gap. Participants learn to solve problems from environmental feedback, verbal instructions, or demonstrations performed by experts. Empirical and simulation work confirms that demonstrations and instructions are more effective for learning than binary feedback (answer correct or not). Results also show that humans successfully generalize what they learn by observation to more complex tasks, suggesting understanding rather than rote memorization of the observed solutions.

Four computational models of complex problem solving are presented. First, a reinforcement learning model trained only on binary environmental rewards (RL-SDCC-SARSA) simulates the binary reinforcement condition. It can learn the task with enough training but is less accurate than humans given equivalent learning, which we argue is evidence that humans may be using distance to goal, look-ahead search, and reasoning. Second, a supervised cascade-correlation neural network (SL-SDCC) model that learns from demonstrations successfully captures human accuracy in the imitation learning group. Third, a reinforcement-based model with direct policy training (RL-SDCC-DPT) that learns from demonstrations also captures the imitation learning group's accuracy. Finally, a supervised knowledge-based cascade-correlation (SL-KBCC) model with selection rules as prior knowledge successfully captures the performance of the verbal instructions group. This model builds more compact networks than SL-SDCC, and they also train faster.

All four models use cascade-correlation networks, which are either trained directly (SL-SDCC and SL-KBCC) or used as function approximators for expected rewards (RL-SDCC-DPT and RL-SDCC-SARSA). In the latter models, a second layer involves learning target expected rewards.
SARSA converts environmental rewards into target rewards, and direct policy training (DPT) converts demonstrations into target rewards. Reinforcement-based models are more complex and costlier to train than supervised systems, but they cover more cognitive phenomena in a single, unified, and parsimonious system, including the use of problem variants, exploration, and working memory limitations. Promising extensions of the reinforcement-based models are proposed: distance-based rewards (DBR), which use distance to goal as a self-generated reward; look-ahead search; and intrinsic exploration, implemented by adding randomness to the action selection system.
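The two reward-conversion schemes above can be illustrated with a minimal sketch. This is not the thesis' implementation: it substitutes a plain dictionary for the cascade-correlation function approximator, and all names and constant values (GAMMA, ALPHA, the 0/1 target range for DPT) are illustrative assumptions. The point is the contrast: SARSA builds its target from the observed reward plus the discounted value of the next state-action pair actually taken, while DPT assigns a high target to a demonstrated action directly.

```python
# Sketch only: a tabular Q store stands in for the thesis'
# cascade-correlation function approximator; constants are assumed.
GAMMA = 0.9   # discount factor (illustrative)
ALPHA = 0.1   # learning rate (illustrative)

def sarsa_target(reward, q, next_state, next_action):
    """SARSA: convert an environmental reward into a target expected
    reward, r + gamma * Q(s', a'), using the action taken next."""
    return reward + GAMMA * q.get((next_state, next_action), 0.0)

def dpt_target(demonstrated):
    """Direct policy training: convert a demonstration into a target
    expected reward; demonstrated actions get the maximal target."""
    return 1.0 if demonstrated else 0.0

def update(q, state, action, target):
    """Move the stored expected reward toward the target."""
    key = (state, action)
    q[key] = q.get(key, 0.0) + ALPHA * (target - q.get(key, 0.0))
    return q[key]

# Usage: one SARSA-style update, one DPT-style update.
q = {("s1", "a1"): 0.5}
update(q, "s0", "a0", sarsa_target(0.0, q, "s1", "a1"))
update(q, "s0", "a2", dpt_target(True))
```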
Keywords/Search Tags: Problem solving, Learning, Model, Search, Complex