
Study on the Fusion of Audio-Visual Information Based on Deep Learning

Posted on: 2017-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Gao
Full Text: PDF
GTID: 2428330596457388
Subject: Engineering
Abstract/Summary:
The brain is the most complex information processing system in nature, and also the most efficient. How it processes different kinds of information is a central topic of cognitive brain science. Information processing in the brain cannot be achieved by a single neuron or a single brain area; multi-channel information is integrated and processed through the interaction of multiple brain regions. The brain is therefore an optimized multi-sensory information fusion system that automatically integrates information from different channels to form a perception of the external world. This paper focuses on the cognitive mechanisms by which visual and auditory information is processed and integrated, and simulates the brain's processing of visual information, auditory information, and their fusion. It first introduces the neural mechanisms of learning and memory for auditory and visual information in the brain, then introduces deep learning network models, with a focus on the deep belief network and its fine-tuning algorithms, and finally implements audio-visual information fusion learning with a deep belief network.

The main innovations of this paper are the following two. First, based on the layered and partitioned way in which the human brain processes visual and auditory information, and on the multilayer structure of deep neural networks, an audio-visual information processing model is constructed that uses a deep belief network to simulate the brain's information integration process and to learn the fusion of audio-visual information. Second, the fine-tuning stage of the deep belief network is optimized. A traditional deep belief network fine-tunes the whole network with either the backpropagation (BP) algorithm or the biological clock algorithm, and both fine-tuning algorithms are time-consuming. This paper uses the biological clock algorithm and the BP algorithm together to fine-tune the deep belief network, which effectively shortens the fine-tuning time and improves the recognition rate of the network to a certain extent. The method is used to simulate audio-visual information learning in the brain and is applied to lip reading. The results show that a deep structure modeled on the brain's information processing can be applied to this task and can effectively compensate for the low speech recognition rate in high-noise environments.
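The layer-wise pretraining stage of a deep belief network can be sketched as a stack of restricted Boltzmann machines (RBMs). The sketch below is only an illustration and not the thesis's implementation: it assumes binary features, assumes that the two modalities are fused by simple concatenation at the input (the abstract does not say how fusion is done), uses one-step contrastive divergence (CD-1), and leaves out the fine-tuning stage entirely. The names `RBM`, `train_fusion_stack`, the layer sizes, and the random toy data are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)
        # Mean squared reconstruction error, useful for monitoring only.
        return float(np.mean((v0 - v1) ** 2))


def train_fusion_stack(audio, visual, n_hidden=(16, 8), epochs=20, seed=0):
    """Greedy layer-wise pretraining on concatenated audio and visual features."""
    rng = np.random.default_rng(seed)
    # Assumption: early fusion by concatenating the two feature vectors.
    layer_input = np.concatenate([audio, visual], axis=1)
    stack = []
    for n_h in n_hidden:
        rbm = RBM(layer_input.shape[1], n_h, rng)
        for _ in range(epochs):
            rbm.cd1_step(layer_input)
        stack.append(rbm)
        # The hidden probabilities of this layer feed the next layer.
        layer_input = rbm.hidden_probs(layer_input)
    return stack, layer_input


# Toy data: 32 samples with 10 "audio" and 12 "visual" binary features.
data_rng = np.random.default_rng(1)
audio = (data_rng.random((32, 10)) < 0.5).astype(float)
visual = (data_rng.random((32, 12)) < 0.5).astype(float)
stack, fused = train_fusion_stack(audio, visual)
print(fused.shape)  # (32, 8)
```

In a full pipeline, the top-layer representation `fused` would be passed to a classifier and the whole network would then be fine-tuned, which is the stage the thesis aims to speed up.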
Keywords/Search Tags: cognition, deep learning, audio-visual information fusion, deep belief network