Research On Key Technologies Of Video Emotion Analysis

Posted on: 2020-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: M P Liu
Full Text: PDF
GTID: 2428330620456161
Subject: Information and Communication Engineering
Abstract/Summary:
Emotion recognition, as an important part of human-computer interaction, has attracted growing attention from researchers. Building emotion recognition systems that let computers automatically recognize people's emotions is of great significance in fields such as human-computer interaction, criminal investigation and justice, and intelligent vehicle systems. People usually express their emotions through changes in facial expression, voice and body posture, so integrating emotion recognition algorithms across several modalities is an important driver of research in this area. This thesis mainly studies a facial expression recognition algorithm based on deep learning, speech emotion recognition algorithms based on machine learning and deep learning, and a multi-modal emotion recognition algorithm based on facial expression and speech. The main contents are as follows:

(1) The research background and significance of emotion recognition are reviewed. Starting from feature extraction and classification, the thesis summarizes the development and current state of facial expression recognition and speech emotion recognition, and then introduces related research results on multi-modal emotion fusion in China and abroad.

(2) The facial expression recognition databases are introduced. The image preprocessing algorithms and the feature operators commonly used for facial expressions are described in detail. The convolutional neural network and the recurrent neural network are then introduced, and the 3D convolutional neural network (3D CNN) used for video analysis is described in detail. The 3D CNN can extract not only the spatial features of a single image but also the temporal information in an image sequence, and it is widely used in video analysis. In this work, the 3D CNN is applied to video facial expression recognition for the first time. Finally, traditional algorithms and deep learning models are tested on the mainstream facial expression databases FER2013, CK+ and eNTERFACE'05. The experiments show that the 3D CNN achieves the best performance among the compared algorithms, and that using the 3D CNN is conducive to the subsequent multi-modal fusion.

(3) Starting from the features used in speech emotion recognition, short-term energy, short-term zero-crossing rate, formants and pitch frequency are introduced, and the spectrogram is described in detail. The thesis then reviews speech emotion recognition algorithms, mainly PCA, SVM, softmax regression and decision trees from machine learning. Next, the speech emotion databases CASIA and eNTERFACE'05 are introduced. Finally, an experimental comparison of these algorithms on the databases shows that the algorithm based on spectrograms and a convolutional neural network achieves the best performance, while the SVM tuned by grid search also achieves good results.

(4) The feature-level fusion and decision-level fusion algorithms are described in detail, with emphasis on D-S evidence theory and the emotion recognition system on mainstream data sets. In the future, with the expansion of multi-modal emotion data sets and the deepening of related research, multi-modal emotion recognition technology will be developed further.
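As an illustration of the decision-level fusion mentioned in (4), the following is a minimal sketch of Dempster's rule of combination from D-S evidence theory, applied to two classifiers' outputs. It is not the thesis's implementation: the function name `dempster_combine`, the two-emotion frame of discernment and all mass values are hypothetical and chosen only to show the mechanics.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its belief mass,
    and the masses in each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            # Agreeing evidence: the product supports the shared hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            # Contradictory evidence: accumulate the conflict coefficient K.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("The two sources are in total conflict.")
    # Normalize by 1 - K so that the fused masses sum to 1 again.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical frame of discernment with two emotions.
HAPPY = frozenset({"happy"})
SAD = frozenset({"sad"})
EITHER = frozenset({"happy", "sad"})  # mass left on "undecided"

# Hypothetical mass assignments from a face model and a speech model.
face = {HAPPY: 0.6, SAD: 0.1, EITHER: 0.3}
speech = {HAPPY: 0.5, SAD: 0.3, EITHER: 0.2}

fused = dempster_combine(face, speech)
decision = max(fused, key=fused.get)
print(sorted(decision), round(fused[decision], 4))
```

In this sketch each modality keeps part of its mass on the whole frame (`EITHER`) to express uncertainty, and the fused decision is simply the hypothesis set with the largest combined mass. How the classifier outputs are turned into mass functions is a design choice that the abstract does not specify.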
Keywords/Search Tags:Multi-modal Emotional Recognition, Deep Learning, 3D CNN, Spectrogram, D-S Evidence Theory