Study On Feature Extraction And Recognition For Speech Emotion

Posted on:2011-05-18

Degree:Doctor

Type:Dissertation

Country:China

Candidate:L S Zhao

GTID:1102360332956988

Subject:Mechanical design and theory

Abstract/Summary:

Speech emotion recognition aims to automatically identify the emotional or physical state of a person from his or her voice. It lies at the intersection of psychology, phonetics, digital signal processing, artificial intelligence, and related fields, and has attracted growing attention from researchers. On the one hand, research on speech emotion recognition can promote the development of these related disciplines. On the other hand, as the technology has developed, it has been widely applied in areas such as entertainment, forensic detection, medicine, and the service sector. Research on speech emotion recognition therefore has both theoretical significance and practical value.

Although speech emotion recognition has made considerable progress in theory and application, it still requires further research because of the complexity of the speech signal and the limits of progress in related disciplines. To establish a text-independent speech emotion recognition system, this thesis focuses on feature extraction and the recognition model. The main contents of the thesis are as follows:

(1) A pitch period detection method based on variance analysis is presented. Building on a study of variance analysis, the thesis gives the principle and flowchart of the method. Variance analysis is first applied to the short-time sampled speech sequence to obtain its variance distribution function; the position of the maximum of this function is then located to detect the pitch period.

(2) A robust pitch period estimation algorithm that combines the wavelet transform with variance analysis is examined. Speech is first decomposed by the wavelet transform. The wavelet coefficients of the high-frequency band are then discarded to filter out noise, and the wavelet coefficients in the fundamental-frequency band are used as the time series for variance analysis to estimate the pitch period of noisy speech. Simulation results show that this algorithm achieves higher pitch extraction accuracy and robustness.

(3) A K-nearest neighbor model with self-adjusting weights is proposed for speech emotion recognition. Building on the traditional K-nearest neighbor model and existing improvements to it, a new weighted K-nearest neighbor model is presented. The new model simultaneously computes within-class weights and between-class weights for the distances between the test sample and its K nearest neighbors in each class, and assigns both types of weights adaptively based on these distances. The proposed algorithm is applied in a speech emotion recognition system that uses global statistical parameters as emotion features and principal component analysis to reduce their dimension. Simulation results demonstrate the effectiveness of the method.

(4) A mixed model for speech emotion recognition is put forward. Different classification feature sets for speech emotion are first extracted according to different speech models, and an individual classifier based on a Gaussian mixture model is designed for each feature set. Speech emotion is then recognized by a combined classifier based on a genetic algorithm. Experimental results show that the new method performs much better than the individual classifiers.

(5) A speech emotion recognition prototype system is developed on a component-based architecture, using the proposed algorithms together with existing methods. It provides a foundation for the design and development of application-level emotional speech processing software.
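The abstract does not spell out how the variance distribution function in contribution (1) is computed. The following is a minimal sketch under one assumption: that "variance analysis" can be read as a one-way analysis-of-variance statistic, in which the samples of a frame are grouped by their phase within a candidate period and the between-group variance is compared with the within-group variance. The function names `variance_ratio` and `estimate_pitch_period` and the parameters `min_period` and `max_period` are hypothetical, and the thesis's own formulation may differ.

```python
def variance_ratio(frame, period):
    """Between-phase over within-phase variance for one candidate period.

    Assumed form (not taken from the thesis): a one-way ANOVA F-like ratio.
    Requires 2 <= period < len(frame).
    """
    n = len(frame)
    grand_mean = sum(frame) / n
    # Group the samples by their phase inside the candidate period.
    groups = [frame[phase::period] for phase in range(period)]
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Normalise by degrees of freedom; the small constant avoids division by zero.
    return (between / (period - 1)) / (within / (n - period) + 1e-12)


def estimate_pitch_period(frame, min_period, max_period):
    """Return the candidate period (in samples) that maximises the ratio."""
    candidates = range(min_period, max_period + 1)
    return max(candidates, key=lambda p: variance_ratio(frame, p))
```

When the candidate period matches the true period, samples that share a phase are nearly equal, so the within-group variance collapses and the ratio peaks. The search range should be limited to plausible pitch periods (for example, 20 to 160 samples at an 8 kHz sampling rate, an illustrative choice), because exact multiples of the true period also score highly.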
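The weighting scheme in contribution (3) is described only in outline, so the sketch below is one possible reading rather than the thesis's model. It assumes inverse-distance weights at both levels: within each class, the K nearest neighbors are weighted by the inverse of their distance to the test sample, and between classes, each class is weighted by the inverse of its weighted distance. The function name `classify`, the parameter `eps`, and the emotion labels in the usage note are hypothetical.

```python
import math
from collections import defaultdict


def classify(train, query, k=3, eps=1e-9):
    """Per-class weighted K-nearest-neighbour decision (illustrative sketch).

    train: list of (feature_vector, label) pairs.
    Returns (predicted_label, scores), where scores sum to 1 across classes.
    """
    # Collect the distances from the query to every training sample, per class.
    distances = defaultdict(list)
    for features, label in train:
        distances[label].append(math.dist(features, query))

    class_distance = {}
    for label, dists in distances.items():
        nearest = sorted(dists)[:k]
        # Within-class weights (assumed inverse-distance form), normalised to 1.
        raw = [1.0 / (d + eps) for d in nearest]
        total = sum(raw)
        class_distance[label] = sum(w / total * d for w, d in zip(raw, nearest))

    # Between-class weights (assumed): normalised inverse of each class distance.
    inverse = {label: 1.0 / (d + eps) for label, d in class_distance.items()}
    norm = sum(inverse.values())
    scores = {label: v / norm for label, v in inverse.items()}
    return max(scores, key=scores.get), scores
```

In the system described above, the feature vectors would be the global statistical parameters after dimension reduction by principal component analysis. For a toy call, `train` could hold a few two-dimensional points labeled `"neutral"` and `"angry"`, and `classify(train, (4.5, 5.2))` would return the label of the closer cluster together with the normalized class scores.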

Keywords/Search Tags:

Speech Emotion Recognition, Feature Extraction, Wavelet Transform, Variance Analysis, Nearest Neighbor, Mixed Model

Related items

1	Research On Driver Emotion Recognition Based On Speech
2	Research On Critical Technologies Of Speech Recognition In Vehicular Noise Environment
3	Research On The Safety Warning Model Of Power Grid Dispatching Operation Based On Speech Emotion Recognition
4	Experimental Data Analysis Of Acoustic Emission Signal Of Axle Fatigue Crack Based On Wavelet Transform And DBN
5	Research On Approximate Nearest Neighbor Query Method For Key Ship Target Recognition
6	Research Of Driver Emotion Recognition Under Partial Occlusion
7	Semantic Emotion Analysis For Immersive Intelligent Living Room Model Research
8	Based On Deep Learning And Physiological Signals Mixed Emotion Recognition Method
9	Research On Driver Road Rage Emotion Recognition Based On Multi-feature Fusion
10	Development Of A New Approach For Damage Identification In Structural Health Monitoring Using A Novel Wavelet Selection Method For Feature Extraction Combined With Artificial Neural Network