A Research Of Prosody Modeling And Synthesis Method In Chinese TTS

Posted on:2009-09-18

Degree:Master

Type:Thesis

Country:China

Candidate:P G He

Full Text:PDF

GTID:2178360245496445

Subject:Circuits and Systems

Abstract/Summary:

Over the past few decades, advances in computing and related fields have driven substantial progress in speech synthesis. Current work centers on Text-To-Speech (TTS), which converts input text into speech output. A TTS system generally consists of four modules: Text Analysis, Prosody Control, Speech Synthesis, and the Unit Database. These modules are not independent, and each one strongly affects the quality of the output speech.

Output speech is evaluated on many aspects, chiefly clarity, intelligibility, and naturalness. The clarity and intelligibility of existing TTS systems are now satisfactory, but overall naturalness still needs improvement. This thesis studies two modules, Prosody Control and Speech Synthesis, with the aim of improving the naturalness of the output speech.

The Prosody Control module greatly affects naturalness. Of the many research topics in Prosody Control, this thesis focuses on prosody modeling. A prosody model predicts quantitative acoustic parameters from high-level qualitative prosody information. We design and implement a predictor for the pitch contour, duration, and pause of Chinese syllables. Experimental results show that the model predicts these parameters with sufficient accuracy.

The Speech Synthesis module builds the final output speech and generally adopts waveform concatenation. After selecting the optimal units, it also modifies the waveform to make the speech more natural. This thesis describes in detail an optimal unit selection algorithm and a Fourier-based speech spectral modification algorithm. The modification algorithm not only smooths the speech spectrum but also avoids the degradation of synthesized speech quality caused by traditional algorithms.

To verify the algorithms, a simple TTS system that uses all of them is constructed. A listening test indicates that its output speech is, to some extent, more natural than that of the previous system.
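The abstract does not spell out how the thesis's unit selection algorithm works. As a rough illustration only, unit selection in concatenative synthesis is commonly framed as a shortest-path search that minimizes the sum of a target cost (how well a unit matches the predicted prosody) and a join cost (how smoothly adjacent units concatenate). The sketch below assumes that formulation; the function name `select_units`, the cost callbacks, and the pitch-tuple unit representation in the usage note are all hypothetical and are not taken from the thesis.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Unit = TypeVar("Unit")


def select_units(
    candidates: Sequence[Sequence[Unit]],
    target_cost: Callable[[int, Unit], float],
    join_cost: Callable[[Unit, Unit], float],
) -> Tuple[List[Unit], float]:
    """Pick one unit per position, minimizing total target + join cost.

    Hypothetical sketch of dynamic-programming (Viterbi-style) unit
    selection; not the algorithm described in the thesis.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate unit")

    # best[i][j]: lowest cost of any path ending at candidate j of position i
    best = [[target_cost(0, u) for u in candidates[0]]]
    # back[i][j]: index of the predecessor candidate on that lowest-cost path
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(candidates)):
        row, ptr = [], []
        for u in candidates[i]:
            costs = [
                best[i - 1][k] + join_cost(prev, u)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[k_min] + target_cost(i, u))
            ptr.append(k_min)
        best.append(row)
        back.append(ptr)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path: List[Unit] = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    path.reverse()
    return path, total
```

For example, if each unit is an assumed `(start_pitch, end_pitch)` pair, `target_cost` could be the distance between the unit's mean pitch and the pitch predicted by the prosody model, and `join_cost` the pitch jump between the end of one unit and the start of the next. The search then trades a slightly worse prosody match for a smoother join when that lowers the total cost.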

Keywords/Search Tags:

Speech Synthesis, ANN, Prosody Modeling, Spectral Modification

Related items

1	Research On 3D Visible Speech Animation Driven By Prosody Text
2	Research On Speech Synthesis Based On Information Supplement
3	The Research On Dai Prosody Prediction Module Of Speech Synthesis
4	The Research Of Speech Synthesis And Prosody Control In Wu-Dialect Text-to-Speech
5	Research On Voice Prosody Modification For Mobile And Portable Platforms
6	An Improved Speech Synthesis Method
7	Research On Chinese Speech Synthesis Method Integrating Pause And Personal Information
8	Research On Spectral Modeling And Parameter Generation In Statistical Parametric Speech
9	Research On Tibetan Speech Synthesis Based On Deep Learning
10	Key Technologies For Text-to-speech Systems