Research On The Application Of Speech Emotion Recognition In The Field Of Public Service

Posted on: 2021-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: X J Tong
GTID: 2416330629981449
Subject: Information Computing and Intelligent Systems
Abstract/Summary:
This research proposes a way to improve service quality by using intelligent computer algorithms to obtain the real-time emotional state of the parties to a conversation and to offer suggestions for correcting negative emotions. Based on the current state of social services, the paper studies and analyzes the influence of language communication on service quality. Starting from the characteristics and development of public services such as medical care, education, and employment, it explores the hidden problems in modern public services. Intelligent speech recognition and speech emotion recognition technologies are therefore introduced to expand and improve the language communication capabilities of public services, reduce communication costs, and lower communication barriers such as the impact of negative emotions during communication, thereby improving the working environment and work efficiency of practitioners and moving toward high-quality intelligent services.

The study uses a Chinese speech emotion corpus as its emotion recognition database, covering six emotions (angry, sad, fear, surprise, happy, and neutral) with different pronunciations. Acoustic features are extracted as Mel-scale Frequency Cepstral Coefficients (MFCC). The speaker's speech is first pre-emphasized: the signal is passed through a high-pass filter at the start of audio transmission to enhance its high-frequency part. The cepstrum of the vocal-tract channel is separated and the corresponding inverse transform is applied to obtain the log spectrum, from which the formants can be calculated. The audio is then framed and windowed according to fixed rules, and a fast Fourier transform (FFT) is applied to each windowed frame to obtain its spectrum. Arranging the frame spectra in time order yields a spectrogram showing the relationship between time, frequency, and energy. Together with the formants, this gives four influencing factors in total, which makes changes in the speech signal over time and frequency easier to see. The spectrum is then passed through a Mel filter bank to obtain the Mel spectrum, and cepstral analysis of the Mel spectrum yields the MFCCs used as voice features.

The Teager Energy Operator (TEO) signal analysis algorithm is used to detect stress and mood changes, and a Hidden Markov Model is used to model the spectrogram. Synthesized speech is then compared with natural speech for training and evaluation.

Finally, S-fold cross-validation is applied: the speech library is randomly divided into S disjoint subsets of equal size, S-1 of them are used as training data to train the model, and the remaining subset is used as test data. This is repeated for each of the S possible choices of test subset, and the model with the smallest average test error over the S evaluations is selected.
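The front end of the MFCC pipeline described in the abstract (pre-emphasis, framing, windowing, and a per-frame spectrum arranged in time order) can be sketched roughly as follows. This is an illustrative sketch, not the thesis's implementation: the function names, the pre-emphasis coefficient 0.97, the Hamming window, and the frame sizes are all assumptions, and a naive DFT stands in for the FFT so that the example needs only the standard library. The Mel filter bank and the final cepstral step are omitted.

```python
import cmath
import math


def pre_emphasize(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def frame_and_window(signal, frame_len, hop):
    """Cut the signal into overlapping frames and apply a Hamming window to each frame."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames


def power_spectrum(frame):
    """One-sided power spectrum via a naive O(N^2) DFT; an FFT would give the same values faster."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def spectrogram(signal, frame_len=64, hop=32, alpha=0.97):
    """Rows are frames in time order, columns are frequency bins, values are energy."""
    emphasized = pre_emphasize(signal, alpha)
    return [power_spectrum(frame) for frame in frame_and_window(emphasized, frame_len, hop)]
```

Each row of the returned list corresponds to one frame and each column to one frequency bin, which is the time, frequency, and energy relationship the abstract refers to. For a pure tone with exactly eight cycles per 64-sample frame, the largest value in every row falls in bin 8.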
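The abstract names the Teager Energy Operator without defining it. In its standard discrete form the operator is ψ[x(n)] = x(n)² - x(n-1)·x(n+1). The sketch below assumes that form; the function names are hypothetical, and how the thesis maps TEO values to stress or mood changes is not stated in the abstract and is not modeled here.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output has two fewer samples than the input because the first and
    last samples lack a neighbor on one side.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def cosine_tone(amplitude, omega, n_samples):
    """Helper for the demonstration: amplitude * cos(omega * n)."""
    return [amplitude * math.cos(omega * n) for n in range(n_samples)]


# For a pure tone A*cos(omega*n) the operator is constant at A^2 * sin(omega)^2,
# so it rises with both amplitude and frequency.
quiet_low = teager_energy(cosine_tone(1.0, 0.2, 100))
loud_high = teager_energy(cosine_tone(2.0, 0.4, 100))
```

Because the output depends on both amplitude and frequency, the operator responds to the kind of energy changes that accompany stressed speech, which is presumably why the abstract pairs it with stress detection.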
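The cross-validation procedure at the end of the abstract can be illustrated with a short sketch. The names `s_fold_splits`, `average_test_error`, and the `evaluate` callback are hypothetical; the callback stands in for whatever training and scoring routine the thesis uses, which the abstract does not specify. Fold sizes are equal when the number of samples is divisible by the number of folds and differ by at most one otherwise.

```python
import random


def s_fold_splits(n_samples, n_folds, seed=0):
    """Yield (train_indices, test_indices) pairs for S-fold cross-validation."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment of samples to folds
    # Striding gives disjoint folds whose sizes differ by at most one sample.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


def average_test_error(evaluate, n_samples, n_folds, seed=0):
    """Average the error returned by evaluate(train, test) over all S folds.

    `evaluate` is a caller-supplied function (an assumption of this sketch)
    that trains on the train indices and returns the error on the test indices.
    """
    errors = [evaluate(train, test) for train, test in s_fold_splits(n_samples, n_folds, seed)]
    return sum(errors) / len(errors)
```

Model selection then amounts to computing `average_test_error` once per candidate model and keeping the candidate with the smallest value.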
Keywords/Search Tags: Public Service, Speech Emotion Recognition, Acoustic Feature Extraction, Spectrum Analysis, Intelligent Service