
Research On Tibetan Speech Synthesis Technology Based On Mixed Primitives

Posted on: 2017-12-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R Z M Cai
GTID: 1318330512971888
Subject: Computer software and theory
Abstract/Summary:
Speech synthesis is one of the key techniques of human-computer interaction, yet it remains a very difficult problem in the field of Chinese information processing. Its main aim is to convert text into clear and fluent speech. From both theoretical and practical points of view, speech synthesis is fundamental to automation, intelligent robotics, human-computer speech interaction systems, and related areas. With the rapid development of computer and communication technology, increasing attention has been paid to corpus-based speech synthesis.

Tibetan information processing is regarded as an important component of Chinese information processing. Over the past 20-plus years it has made great progress in segmentation, tagging, and word frequency statistics, but Tibetan speech synthesis has only taken its first steps. Many valuable attributes of Tibetan relevant to speech synthesis have not yet been thoroughly explored and described, and the study of the ontology of the Tibetan language is still immature. Existing systems can neither analyze Tibetan prosody qualitatively and quantitatively nor derive the necessary control information from textual analysis. Therefore, on the basis of Tibetan language ontology, this thesis studies the characteristics of Tibetan texts and rhythms from both linguistic and phonetic points of view, and designs and develops a practical Tibetan speech synthesis system based on a hybrid unit model. The following results have been obtained:

(1) In Chapter 2, starting from text features, we investigate preprocessing issues in speech synthesis such as non-Tibetan textual symbols and sentence boundary identification. In accordance with the practical requirements of Tibetan speech synthesis, a Tibetan word segmentation algorithm constrained by part-of-speech rules is proposed. Compared with the traditional algorithm, this algorithm uses the part-of-speech constraint rules to avoid generating most crossing and combinational ambiguities, and it improves the identification strategy for abbreviated words and unknown words. As a result, the efficiency of word segmentation is greatly improved. In addition, in order to synthesize unknown words, we present a decomposition algorithm for Tibetan textual components. The performance of the algorithm is verified with the Tibetan component analysis system constructed in this thesis. The statistical results that this system produces from a large-scale corpus are used to guide the construction of the database and unit selection.

(2) In Chapter 3, based on acoustic and grammatical features, this thesis studies Tibetan prosodic rules through statistical analysis of the prosodic hierarchical structure, stress patterns, and intonation phenomena of Amdo Tibetan. First, we propose a prediction algorithm for the Tibetan prosodic hierarchy. Using the frequency of function words and the length of prosodic phrases, this algorithm marks the boundaries of each prosodic unit dynamically. This prevents the prosodic structure division from depending excessively on the segmentation results and ensures that the prosodic hierarchical structure remains complete. Second, we calculate the relative coefficient of stress at different levels. In the grammatical analysis, we first set the grammatical stress of prosodic words, prosodic phrases, and intonational phrases, and then calculate the emphatic stress of target sentences according to the relative coefficient of stress of each rhythm unit. Finally, we give the characteristics and rules of declarative, interrogative, imperative, and exclamatory sentences. An experiment shows that these prosodic rules play a crucial role in representing the rhythm of speech and improve its naturalness.

(3) A practical corpus is established on the basis of unit selection, which is the key ingredient of corpus-based speech synthesis. Therefore, in Chapter 4, we propose a construction strategy for a so-called hybrid unit corpus and give the corresponding algorithms. Data from subjective and objective experiments indicate that this strategy and these algorithms effectively preserve both the integrity of larger units and the flexibility of smaller units. To avoid excessive adjustment of units during synthesis, this thesis uses a multi-sample waveform concatenation approach based on the hybrid unit corpus. The organization strategy and search algorithms for the corpus are also discussed. Experimental data show that, compared with the traditional method, this method improves the synthesis rate and enhances the real-time performance of the system.

(4) In Chapter 5, taking the Amdo Tibetan speech synthesis system as the example, we introduce the design principles, objectives, functional characteristics, and performance evaluation of the Tibetan speech synthesis system. The system has distinctive characteristics in its text analysis and prosodic control modules, and it therefore provides a powerful platform for further study.
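
The hybrid-unit idea in result (3) can be illustrated with a small sketch. The abstract does not give the thesis's actual selection algorithm, so what follows is only an assumed greedy longest-match strategy: the function name `select_units`, the `inventory` structure, and the `max_len` parameter are all hypothetical, and the syllable labels are placeholders rather than real Tibetan data. It shows how a corpus that stores units of several sizes might prefer the largest recorded unit and fall back to smaller ones.

```python
from typing import List, Set, Tuple

Unit = Tuple[str, ...]


def select_units(syllables: List[str],
                 inventory: Set[Unit],
                 max_len: int = 4) -> List[Unit]:
    """Cover a syllable sequence with corpus units, longest match first.

    `inventory` is a hypothetical set of multi-syllable units assumed to be
    recorded in the corpus. Single syllables are always accepted as a
    fallback (an assumption made for this sketch only).
    """
    units: List[Unit] = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = tuple(syllables[i:i + n])
            if n == 1 or candidate in inventory:
                units.append(candidate)
                i += n
                break
    return units


# Placeholder syllable labels, not real Tibetan text.
inventory = {("a", "b", "c"), ("d", "e")}
print(select_units(["a", "b", "c", "d", "e", "f"], inventory))
```

Preferring longer units keeps more natural coarticulation inside each recorded segment, while the single-syllable fallback keeps the coverage flexible; a real system would also weigh acoustic join costs, which this sketch leaves out.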
Keywords/Search Tags:Tibetan information processing, speech synthesis, unit, component