A Computational Neural Model Of Decision-making Based On Reward-modulated Plasticity

Posted on: 2013-10-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z B Cheng
GTID: 1220330392452124
Subject: Computer Science and Technology
Abstract/Summary:
Human and animal behavior is often directed toward the pursuit of reward. Reward is an essential signal that drives learning in decision-making processes. A fundamental question in neuroscience concerns the decision-making processes by which animals and humans select actions in the face of reward. This thesis provides a normative computational neural model of how reward affects decision-making in three typical decision-making tasks. More specifically, we explore biologically realistic models to elucidate the neural circuit mechanisms that explain how the recorded neural activity is produced and how the model leads to the observed decision behaviors in these tasks.

Based on reward-modulated plasticity, biologically plausible neural models are proposed for three different decision-making tasks. First, we put forward a decision-making model in which the policy is updated through a policy parameter on the basis of reinforcement learning theory. Based on this model, a policy that satisfies the matching law is derived under some simple assumptions. Theoretical analysis and simulation results show that the decision behavior achieved by the policy obeys the matching law. In addition, the matching behaviors in two classical experiments are reproduced using the policy. Our results thus offer a plausible strategy underlying the matching law. Furthermore, we predict that our model might be implemented in the brain through the neural circuit of the prefrontal cortex and the basal ganglia.

Second, we propose a winner-take-all decision circuit combined with reward-modulated learning rules to perform a decision-making task that requires computing the log likelihood ratio (logLR). Learning in our model consists of updating synaptic strength by stochastically jumping between two discrete synaptic states according to a reward signal. Specifically, we show that in the steady state of the learning process, the dynamical change of synaptic strength is a function of the posterior probability of a correct response given a stimulus, and the difference of synaptic weights encodes the logLR. In particular, our model reproduces the main behavioral results obtained by Yang and Shadlen in a monkey experiment.

Finally, we propose a neural model for a temporal discrimination task under reward-dependent synaptic plasticity. The task requires holding the first stimulus (f1) in working memory, followed by a comparison of the second stimulus (f2) with the stored f1 and a binary decision (f2 > f1 or f2 < f1). Measurements of neural activity in working memory during the task show that the content of working memory is not only stimulus dependent but also strongly time varying. A decision circuit endowed with reward-modulated synaptic plasticity makes choices on the basis of a reservoir-like neural circuit that implements the working memory and the comparison. The comparison result encoded in the activities of the reservoir can be read out through a sparse network. The observed decision behaviors in the discrimination task are reproduced using the neural model. In particular, the model is validated by reproducing salient features of the neuronal responses observed in monkey experiments.
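
The learning rule described for the second model, in which synapses jump stochastically between two discrete states according to a reward signal, can be illustrated with a small simulation. The sketch below is not the thesis's actual rule: the function name `simulate_fraction_potentiated`, the single transition probability `q`, the population size, and the trial count are all assumptions chosen for illustration. It shows only one generic property of such rules, namely that the fraction of potentiated synapses settles near the probability of reward.

```python
import random


def simulate_fraction_potentiated(p_reward, n_synapses=200, q=0.05,
                                  n_trials=4000, seed=0):
    """Illustrative binary-synapse model (all parameters are assumptions).

    Each synapse is either depressed (0) or potentiated (1).
    On a rewarded trial, each depressed synapse potentiates with probability q.
    On an unrewarded trial, each potentiated synapse depresses with probability q.
    Returns the fraction of potentiated synapses averaged over the second
    half of the trials.
    """
    rng = random.Random(seed)
    synapses = [0] * n_synapses
    history = []
    for _ in range(n_trials):
        rewarded = rng.random() < p_reward
        for i in range(n_synapses):
            if rewarded and synapses[i] == 0 and rng.random() < q:
                synapses[i] = 1  # reward-gated potentiation
            elif not rewarded and synapses[i] == 1 and rng.random() < q:
                synapses[i] = 0  # depression when reward is omitted
        history.append(sum(synapses) / n_synapses)
    tail = history[n_trials // 2:]  # discard the initial transient
    return sum(tail) / len(tail)


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8):
        print(p, round(simulate_fraction_potentiated(p), 3))
```

Under these assumptions, the expected change per trial in the potentiated fraction c is q(p - c), where p is the reward probability, so c drifts toward p. How the thesis maps such steady-state weights onto the logLR through weight differences is not reproduced here.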
Keywords/Search Tags:decision-making, reinforcement learning, winner-take-all, neural circuit