Endowing computers with the ability to perceive and express human emotions is an important part of human-computer interaction, and speech is one of the main channels of emotional expression. Emotion recognition based on speech signals has therefore attracted wide attention. As research on speech emotion recognition has deepened, new questions have emerged; among them, the contextual information between utterances carries a large amount of emotion-related information, and exploiting the emotional correlation between contexts can clearly improve recognition performance. Based on this, this paper proposes two different methods of using contextual information and combines them for speech emotion recognition.

First, a speech emotion recognition method based on context features is proposed. The method extracts emotion features, namely context features, from historical utterances through an IFCN_LSTM cascade network, refines the precision of the spatial features while preserving their temporal order, and then concatenates the context features with the current target utterance; the emotional state is then recognized from the utterance enriched with these contextual features. Within the cascade, the IFCN network extracts spatial structure features, removes redundant features, refines spatial precision, and keeps the same temporal sequence as its input; the LSTM network models the time-dependent relationships among the temporal features, correlates emotion features across time steps, and captures the sequence of emotional changes. By extracting context features from historical utterances, the historical emotional state is transferred to the target utterance, and associating the emotionally salient context features of the historical utterances with the target utterance improves recognition accuracy.

Second, a context feature fusion method based on the attention mechanism is proposed. The attention mechanism computes the weight of the context utterance at each time step of the current target utterance, describing the degree of correlation between the historical emotional state and the current emotional state, so that the emotion-related information in the context utterance can be learned and integrated into the current target utterance and the network can focus more on the emotionally salient parts of the current utterance. The context features contain finer-grained emotional information. Applying this attention-based fusion mechanism embeds the context features into the current target utterance, describes the emotional correlation between the current and historical emotional states more accurately, and improves the overall effect of emotion recognition.
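To make the described pipeline concrete, the following is a minimal sketch in PyTorch of the two ideas above: a cascade that extracts context features from a historical utterance and an attention module that fuses them into the target utterance. The exact internal design of the IFCN block is not specified in this abstract, so a 1-D convolutional front end stands in for it here; the class names, feature dimensions, and number of emotion classes are all illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class ContextFeatureExtractor(nn.Module):
    """Hypothetical IFCN_LSTM-style cascade: a convolutional front end
    (a stand-in for the IFCN block, whose structure is not given here)
    followed by an LSTM that models temporal dependencies across frames."""
    def __init__(self, feat_dim=40, hidden_dim=128):
        super().__init__()
        # Convolutional stage: refines per-frame (spatial) features while
        # padding preserves the time dimension, matching the input sequence.
        self.conv = nn.Sequential(
            nn.Conv1d(feat_dim, hidden_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden_dim, hidden_dim, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # LSTM stage: correlates emotion features across time steps.
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)

    def forward(self, x):            # x: (batch, time, feat_dim)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.lstm(h)          # (batch, time, hidden_dim)
        return h

class AttentionContextFusion(nn.Module):
    """Hypothetical attention-based fusion: each frame of the target
    utterance attends over the frames of the historical (context) utterance,
    and the attended context is concatenated back onto the target frames."""
    def __init__(self, hidden_dim=128):
        super().__init__()
        self.scale = hidden_dim ** 0.5

    def forward(self, target, context):          # both: (batch, time, hidden_dim)
        scores = torch.matmul(target, context.transpose(1, 2)) / self.scale
        weights = torch.softmax(scores, dim=-1)  # correlation of each target frame
                                                 # with every context frame
        attended = torch.matmul(weights, context)
        return torch.cat([target, attended], dim=-1)

# Usage sketch: extract context features from a historical utterance, fuse them
# into the current target utterance, then classify the fused representation.
extractor = ContextFeatureExtractor()
fusion = AttentionContextFusion()
classifier = nn.Linear(2 * 128, 4)               # 4 emotion classes (assumed)

history = torch.randn(2, 120, 40)                # (batch, frames, acoustic features)
target = torch.randn(2, 100, 40)

ctx_feat = extractor(history)
tgt_feat = extractor(target)
fused = fusion(tgt_feat, ctx_feat)
logits = classifier(fused.mean(dim=1))           # utterance-level prediction
```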