Emotional Speech Synthesis Based On Optimization Of Prosody Parameters

Posted on: 2021-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Cao
GTID: 2437330647458008
Subject: Education Technology
Abstract/Summary:
Human speech contains rich emotional information. Without emotion, a sentence cannot convey its meaning accurately. With the development of computer technology and natural language processing, expectations for the quality of synthetic speech have risen. As an efficient way to improve that quality, emotional speech synthesis is becoming more and more important. In the field of computer-aided instruction (CAI), emotional speech synthesis can be applied to language teaching and can effectively promote the learning of speech. However, synthesized emotional speech still has problems, such as a lack of naturalness and inaccurate expression of emotion.

This thesis therefore proposes a method that combines the Tacotron model with prosodic feature optimization. Starting from the emotional speech synthesized by the end-to-end model, the prosodic features are further optimized to obtain speech that is natural and more emotional, laying the groundwork for later application of emotional speech synthesis in bilingual teaching. The main work and innovations of this thesis are as follows:

1. Establishing an end-to-end emotional speech synthesis model. Based on the Tacotron model, five emotional speech synthesis models (neutral, anger, happiness, disgust and sleepiness) were obtained through fine-tuning, and they produced natural speech.
2. Analyzing the prosodic features. The mapping between affective prosodic features and emotional states was established, and the emotional prosodic feature parameters of speech were determined.
3. Modifying the prosodic features. According to the prosodic parameters of the different emotions, the prosodic parameters of the synthesized emotional speech were modified to improve its emotional expression.

The results show that this method can improve the emotional expression of speech to a certain extent. Although naturalness decreases slightly compared with the unmodified emotional speech, the modified speech still sounds natural.
Keywords/Search Tags:Emotional speech synthesis, End-to-End, Prosodic features, Feature modification
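
The prosodic modification step described in the abstract can be illustrated with a minimal sketch. It assumes that the modification acts on a frame-level pitch (F0) contour and that each emotion is described by a mean-scaling factor and a range-scaling factor. The abstract does not state which parameters or values the thesis uses, so the names `EMOTION_FACTORS` and `modify_f0` and all numbers below are hypothetical.

```python
from statistics import mean

# Hypothetical per-emotion factors, chosen only for illustration.
# The abstract does not report the actual prosodic parameter values.
EMOTION_FACTORS = {
    "neutral": {"f0_mean": 1.0, "f0_range": 1.0},
    "anger": {"f0_mean": 1.25, "f0_range": 1.5},
}


def modify_f0(f0, mean_scale, range_scale):
    """Rescale the mean and spread of the voiced frames of an F0 contour.

    f0 is a list of per-frame pitch values in Hz, with 0.0 marking
    unvoiced frames, which are left unchanged.
    """
    voiced = [v for v in f0 if v > 0]
    if not voiced:
        return list(f0)
    mu = mean(voiced)
    # Shift the contour mean, then stretch deviations around the new mean.
    return [
        mu * mean_scale + (v - mu) * range_scale if v > 0 else 0.0
        for v in f0
    ]


# Toy contour: two unvoiced frames around three voiced frames.
contour = [0.0, 100.0, 120.0, 140.0, 0.0]
factors = EMOTION_FACTORS["anger"]
angry = modify_f0(contour, factors["f0_mean"], factors["f0_range"])
```

In a full pipeline the modified contour would still have to be resynthesized into audio, and duration and energy would likely be adjusted as well. Those steps are outside this sketch.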