Study On Feature Extraction And Recognition For Speech Emotion

Posted on: 2011-05-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L S Zhao
Full Text: PDF
GTID: 1102360332956988
Subject: Mechanical design and theory
Abstract/Summary:
Speech emotion recognition aims to automatically identify the emotional or physical state of a human being from his or her voice. It lies at the intersection of psychology, phonetics, digital signal processing, artificial intelligence and other disciplines, and has attracted growing attention from scholars. On the one hand, the study of speech emotion recognition can promote the development of related disciplines. On the other hand, as the technology has developed, it has been applied in many areas such as entertainment, forensic detection, medicine and the service industry. Research on speech emotion recognition therefore has both theoretical significance and application value.

Although speech emotion recognition has made considerable progress in theory and application, it still needs further research because of the complexity of the speech signal and the limited development of related disciplines. To establish a text-independent speech emotion recognition system, this thesis focuses on feature extraction and the recognition model. The main contents of this thesis are as follows:

(1) A pitch period detection method based on variance analysis is presented. Building on the study of variance analysis, the thesis gives the principle and flowchart of the method. First, variance analysis is applied to the short-time sampled speech sequence to obtain its variance distribution function. The pitch period is then detected by locating the position of the maximum of the variance distribution function.

(2) A robust pitch period estimation algorithm combining the wavelet transform with variance analysis is examined. First, the speech is decomposed by the wavelet transform. The wavelet coefficients of the high-frequency band are then discarded to filter out noise, and the wavelet coefficients of the speech in the fundamental frequency band are taken as the time series for variance analysis in order to estimate the pitch period of noisy speech. Simulation results show that the algorithm achieves higher pitch extraction accuracy and robustness.

(3) A K-nearest neighbor model with self-adjusting weights is proposed for speech emotion recognition. Building on the traditional K-nearest neighbor model and existing improved algorithms related to it, a new weighted K-nearest neighbor model is presented. The new model simultaneously calculates the within-class weights and the between-class weights of the distances between the test sample and the K nearest neighbors in each class, and assigns the two types of weights adaptively based on these distances. The proposed algorithm is applied to a speech emotion recognition system. The system adopts global statistical parameters as emotion features and uses principal component analysis to reduce the dimension of these features. Simulation results demonstrate the effectiveness of the method.

(4) A mixed model for speech emotion recognition is put forward. First, different classification feature sets for speech emotion are extracted according to different speech models, and an individual classifier based on a Gaussian mixture model is designed for each feature set. Finally, speech emotion is recognised by a combined classifier based on a genetic algorithm. Experimental results show that the new method performs much better than the individual classifiers.

(5) A speech emotion recognition prototype system is developed on a component-based architecture, using the proposed algorithms together with existing methods. It provides a foundation for the design and development of application-level emotional speech processing software.
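The abstract does not define the variance distribution function used in item (1), so the following Python sketch is only one plausible reading and not the thesis's actual method. It assumes a classical analysis-of-variance period search: the frame is folded at each candidate period, the between-phase variance is compared with the within-phase variance as an F-type ratio, and the candidate with the largest ratio is taken as the pitch period. The names `variance_ratio`, `estimate_pitch_period` and `synthetic_voiced_frame`, and the test signal itself, are hypothetical and introduced purely for illustration.

```python
import numpy as np


def variance_ratio(frame, period):
    """F-type ratio of between-phase to within-phase variance for one candidate period.

    Assumption: this stands in for the 'variance distribution function'
    mentioned in the abstract, whose exact definition is not given there.
    Requires 2 <= period < len(frame).
    """
    frame = np.asarray(frame, dtype=float)
    n = frame.size
    phase = np.arange(n) % period                      # phase bin of every sample
    counts = np.bincount(phase, minlength=period)      # samples per phase bin
    sums = np.bincount(phase, weights=frame, minlength=period)
    means = sums / counts                              # mean waveform over one period
    grand_mean = frame.mean()
    ss_between = np.sum(counts * (means - grand_mean) ** 2)
    ss_total = np.sum((frame - grand_mean) ** 2)
    ss_within = max(ss_total - ss_between, 1e-12)      # guard against division by zero
    # Dividing by the degrees of freedom penalises multiples of the true period.
    return (ss_between / (period - 1)) / (ss_within / (n - period))


def estimate_pitch_period(frame, min_period, max_period):
    """Return the candidate period (in samples) with the largest variance ratio."""
    periods = np.arange(min_period, max_period + 1)
    scores = np.array([variance_ratio(frame, p) for p in periods])
    return int(periods[np.argmax(scores)]), scores


def synthetic_voiced_frame(fs=8000, f0=100.0, n_samples=640, noise_std=0.05, seed=0):
    """Hypothetical test signal: a fundamental plus one harmonic and a little noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / fs
    clean = np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
    return clean + noise_std * rng.standard_normal(n_samples)


if __name__ == "__main__":
    fs = 8000
    frame = synthetic_voiced_frame(fs=fs, f0=100.0)
    period, _ = estimate_pitch_period(frame, min_period=20, max_period=160)
    print(f"estimated period: {period} samples, about {fs / period:.1f} Hz")
```

Under the same assumption, the noise-robust variant in item (2) would apply the same search to the wavelet coefficients of the fundamental frequency band rather than to the raw samples. The search range of 20 to 160 samples corresponds to 50 to 400 Hz at an 8 kHz sampling rate, which is an illustrative choice and not a value taken from the thesis.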
Keywords/Search Tags: Speech Emotion Recognition, Feature Extraction, Wavelet Transform, Variance Analysis, Nearest Neighbor, Mixed Model