
Emotional Speech Voice Analysis And Synthesis

Posted on: 2015-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: X W Li
Full Text: PDF
GTID: 2268330425996305
Subject: Computer software and theory
Abstract/Summary:
The sound source plays an important role in the production of emotional speech, and different emotions show different voice sound quality features. Previous studies of emotional speech focused on prosodic features and a small number of sound quality features. These features differ significantly between specific emotional categories. However, analyzing a wider, more comprehensive set of emotional categories requires more complex characteristic parameters. This thesis takes the voice sound quality features of emotional speech as its research object. By extracting and analyzing the voice sound quality parameters, we establish the correspondence between the parameters and emotion. Based on this correspondence, we can adjust the input of the synthesis model and synthesize emotional speech.

First, we choose the seven emotional categories recognized by most researchers as the research object, namely sadness, happiness, anger, surprise, fear, disgust and neutral. We then extract the voice sound quality parameters from the speech samples of each of the seven emotions. Nine voice sound quality parameters are extracted in this study, namely jitter, shimmer, pulse amp, HNR, MFDR, mean F0, NAQ, pitch range and H1-H2. The parameter data are then analyzed with a variety of statistical methods. From the results of the statistical analysis, we found the following:

(1) Some parameters show emotional “universality”, i.e., they differ significantly across most combinations of vowel and emotion. For example, in the Kruskal-Wallis rank test covering all vowel and emotion samples, MFDR differed significantly among the emotions. In the tests on specific emotional combinations, MFDR also differed significantly in more combinations than the other parameters.

(2) In the tests on specific emotional combinations, the significant difference of some parameters is tied to specific vowels and emotions. For example, for the vowel /e/, jitter differs significantly in the emotional combinations that contain “anger”, but not in the combinations without “anger”. Another example is H1-H2, which shows no significant difference in the emotional combinations of the vowel /e/, but does in the emotional combinations of the vowel /i/.

(3) From the viewpoint of emotion, some emotional combinations are more easily distinguished by the voice sound quality parameters. For example, for the vowel /i/, more parameters differ significantly in the emotional combinations “fear-neutral”, “fear-disgust” and “fear-surprise”. This indicates that these emotional combinations are easier to distinguish by voice sound quality features.

By comprehensively analyzing the parameters, we obtained the mapping between the typical values of the parameters and the emotions. Based on this mapping, we can adjust the input of the speech synthesis model and synthesize emotional speech with the STRAIGHT algorithm.
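
As an illustration of the kind of analysis described above, the following is a minimal Python sketch, not the thesis's own implementation. It assumes the common "local" definitions of jitter and shimmer (mean absolute difference between consecutive glottal periods or peak amplitudes, divided by their mean) and computes the Kruskal-Wallis H statistic with average ranks and the standard tie correction. The function names and all sample values are hypothetical and chosen only for demonstration.

```python
from itertools import combinations


def local_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean.

    Applied to glottal periods this is local jitter; applied to peak
    amplitudes it is local shimmer (assumed definitions, see lead-in).
    """
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def average_ranks(values):
    """Return 1-based ranks (ties get the average rank) and the tie group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of samples (one list per emotion).

    Raises ZeroDivisionError if every pooled value is identical.
    """
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum * rank_sum / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    return h / correction


# Hypothetical parameter values per emotion, for illustration only.
samples = {
    "anger": [5.1, 4.8, 5.6, 5.3],
    "neutral": [3.2, 3.5, 3.1, 3.4],
    "sadness": [2.4, 2.9, 2.6, 2.2],
}

print("H (all emotions):", kruskal_wallis_h(list(samples.values())))
for a, b in combinations(samples, 2):
    print(f"H ({a}-{b}):", kruskal_wallis_h([samples[a], samples[b]]))
```

For reasonably large samples, H is usually compared against a chi-square distribution with k - 1 degrees of freedom, where k is the number of groups. A statistics library would normally supply the p-value.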
Keywords/Search Tags:emotional speech, voice sound quality parameter, emotional speech analysis, emotional speech synthesis