Study On Reinforcement Learning Of Pigeons' Visual-Behavioral Decision-Making

Posted on: 2018-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Tao
GTID: 2310330515473237
Subject: Control theory and control engineering
Abstract/Summary:
Behavioral decision-making (cognitive execution) is essential: by judging external information, agents rely on it to survive under natural selection. Vision is the main source of external information, accounting for more than 80% of all sensory information. In nature, most of the visual-behavioral decision-making that is vital to agents is acquired through learning from experience (reinforcement learning). Pigeons have become a typical experimental model because of their strong vision and a decision-making ability no less than that of mammals. Studying reinforcement learning in pigeons' visual-behavioral decision-making is therefore of great significance for explaining the cognitive mechanisms of agents during decision-making, for understanding the brain mechanisms behind those decisions, and for deepening our understanding of how the brain makes cognitive decisions.
Although many results on pigeons' visual-behavioral decision-making have been obtained, they focus on reinforcement learning under static rules. Moreover, the reinforcement learning algorithms used are so simple that a fixed learning rate and a single reward matrix cannot adequately simulate agents' behavioral decision-making under dynamic environmental rules. In addition, the role of neurons in the NCL (nidopallium caudolaterale) in reinforcement learning is not clear. In this thesis, experimental paradigms of visual-behavioral decision-making under dynamic rules are therefore designed, pigeons are trained behaviorally, and neuronal signals are recorded, in order to study the mechanism of behavioral decision-making and the response features of NCL neurons from the behavioral and the neuronal perspective, respectively. The work is as follows:
(1) Two experimental paradigms of visual-behavioral decision-making under dynamic rules are designed: a random experiment and a reversal experiment. Following the proposed experimental procedure, hardware and software platforms for behavioral training are built to automate the training of pigeons based on specific reward and punishment information. Neural signals from the NCL are recorded synchronously and preprocessed.
(2) A new dynamic reinforcement learning model is proposed. By improving the classical Q-Learning model with respect to its learning rate and reward matrix, dynamic reinforcement learning models are proposed to simulate the pigeons' behavior in the two kinds of experiments. The behavioral prediction errors and learning rates of the dynamic models are compared with those of the classical Q-Learning models. Compared with the classical models, the dynamic models reduce the behavioral prediction errors by 49.98% and 30.55%, respectively, and their learning rates also reflect the pigeons' internal learning states in different training phases.
(3) Statistical analyses are conducted on the characteristics of NCL spike signals in different training phases. Response signals from valid trials are selected, and an appropriate response time window is chosen to calculate firing rates as the response features of the spike signals. Finally, differences in these features across the reinforcement learning process are analyzed with the Mann-Whitney rank test. The results show that 10/60 neurons carry information about reward and punishment and 37/60 neurons carry information about different learning states, indicating that NCL neurons play different roles in the reinforcement learning process.
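The abstract does not state the update rule of the dynamic model, so the following Python sketch only illustrates the general idea behind point (2) and is not the thesis's method. It compares a classical fixed-learning-rate Q-Learning update with a variant whose learning rate tracks a running average of the absolute prediction error (a Pearce-Hall-style rule, swapped in here as an assumption) on a simulated two-choice reversal task. The task layout, the +1/-1 reward coding, and every name and parameter value (`simulate`, `alpha_fixed`, `alpha_min`, `alpha_max`, `eta`, `epsilon`) are hypothetical, and the dynamic reward matrix mentioned in the abstract is not modeled.

```python
import random


def simulate(n_trials=400, reversal_at=200, dynamic=True, seed=0,
             alpha_fixed=0.1, alpha_min=0.05, alpha_max=0.8,
             eta=0.3, epsilon=0.1):
    """Simulate a two-choice reversal task with a Q-Learning agent.

    All parameters are illustrative assumptions, not values from the thesis.
    Returns (errors, alphas): per-trial wrong-choice flags and learning rates.
    """
    rng = random.Random(seed)
    q = [0.0, 0.0]      # action values for the two choices
    surprise = 0.0      # running average of |prediction error|
    errors, alphas = [], []

    for t in range(n_trials):
        # The rewarded choice switches once, at the reversal trial.
        rewarded = 0 if t < reversal_at else 1

        # Epsilon-greedy action selection (ties go to choice 0).
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = 0 if q[0] >= q[1] else 1

        # Assumed coding: +1 for reward, -1 for punishment.
        reward = 1.0 if action == rewarded else -1.0

        # Single-step task, so the Q-Learning target is just the reward.
        delta = reward - q[action]

        if dynamic:
            # Assumed rule: the learning rate follows recent surprise.
            surprise += eta * (abs(delta) - surprise)
            alpha = alpha_min + (alpha_max - alpha_min) * min(1.0, surprise)
        else:
            alpha = alpha_fixed

        q[action] += alpha * delta
        errors.append(action != rewarded)
        alphas.append(alpha)

    return errors, alphas


if __name__ == "__main__":
    for dynamic in (False, True):
        errors, alphas = simulate(dynamic=dynamic, epsilon=0.0)
        label = "dynamic" if dynamic else "fixed"
        print(label, "errors after reversal:", sum(errors[200:]))
```

With exploration switched off (`epsilon=0.0`), the fixed-rate learner needs several trials to abandon the old choice after the reversal, while the adaptive rate rises as soon as outcomes become surprising. This only mirrors the qualitative claim that a time-varying learning rate can track the learning state; it says nothing about the reported 49.98% and 30.55% reductions, which come from fitting the models to the pigeons' behavior.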
Keywords/Search Tags: visual-behavioral decision-making, reinforcement learning, dynamic reinforcement learning model, neuronal response features of NCL