Thai Text To Speech System Text Analysis And Processing

Posted on: 2015-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: X E Lin
Full Text: PDF
GTID: 2268330431967525
Subject: Signal and Information Processing
Abstract/Summary:
Speech synthesis is the process of converting input text into a speech signal that humans can understand. Speech synthesis and speech recognition are the enabling technologies for human-computer voice communication. Text-to-speech (TTS) conversion is currently an effective way to achieve speech synthesis, and the naturalness of the synthesized speech has become a key factor in promoting the application of the technology. A TTS system is divided into a front-end text analysis module and a back-end speech synthesis module, and the results of text analysis and processing directly determine the naturalness of the synthesized speech.

This thesis studies the development of a Thai text-to-speech system, focusing on Thai word segmentation, text normalization, and romanization. The main work includes:

1. We build Thai character sets based on the features of the Thai script and apply them to a forward maximum matching word segmentation algorithm. Experimental results show that, on material containing unregistered (out-of-vocabulary) words, the segmentation accuracy improves from 85.69% to 94.04%.

2. We propose a rule-based method, combined with keywords, for Thai text normalization. In the special character processing module, we first classify the version numbers, physical units, currency symbols, abbreviations, and other special characters that appear in Thai text. We then summarize the character types that are prone to ambiguity and build a keyword dictionary. On this basis, a program written in C processes the special characters and converts them into standard Thai text. Experimental results show an accuracy of 97.83% on the inside (closed) test set and 97.12% on the outside (open) test set, and the disambiguation accuracy for most nonstandard terms exceeds 95%.

3. Based on the characteristics of Thai syllable structure, we summarize and consolidate the rules for combining vowels, initial consonants, and final (tail) consonants. On this basis, we implement Thai text romanization in Perl, using the syllable as the basic unit. Test results show that the romanization output meets the requirements of the back-end speech synthesis module and also reflects the results of word segmentation and text normalization.
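
To make the segmentation step in item 1 more concrete, the following is a minimal sketch of generic forward maximum matching in Python. It is not the thesis's implementation: the function name `forward_maximum_match`, the toy lexicon, and the single-character fallback for unknown text are all assumptions made for illustration, and the Thai character sets that the thesis adds for handling unregistered words are not modeled here.

```python
def forward_maximum_match(text, lexicon, max_len=None):
    """Segment `text` greedily, left to right, taking the longest lexicon word.

    `lexicon` is a set of known words (hypothetical toy data in this sketch).
    Text that matches no lexicon entry is emitted one code point at a time,
    which is a simplification: it can separate a Thai combining mark from
    its base consonant.
    """
    if max_len is None:
        # The longest lexicon entry bounds the window size.
        max_len = max((len(word) for word in lexicon), default=1)

    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the lexicon matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Toy lexicon, assumed for illustration only.
TOY_LEXICON = {"ฉัน", "รัก", "ภาษา", "ไทย", "ภาษาไทย"}

if __name__ == "__main__":
    print(forward_maximum_match("ฉันรักภาษาไทย", TOY_LEXICON))
```

Because the longest match wins, the sketch prefers the compound "ภาษาไทย" over the shorter entries "ภาษา" and "ไทย". That greedy choice is also the main weakness of the plain algorithm on unregistered words, which is the case the thesis reports improving.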
Keywords/Search Tags: Speech synthesis, Text analysis, Thai word segmentation, Normalization, Thai Romanization