Polyphone Research In The System Of Text-to-speech

Posted on:2011-08-08

Degree:Master

Type:Thesis

Country:China

Candidate:Q Li

Full Text:PDF

GTID:2155360308454317

Subject:Chinese Philology

Abstract/Summary:

Text-to-speech (TTS) technology uses computer programs to convert written text into speech. Because it draws on linguistics, phonetics, computer programming, digital signal processing, and other fields, it is a comprehensive, multidisciplinary undertaking. Input methods, library services, scheduling systems, Chinese language teaching software, voice reading functions in electronic products, automatic announcement and reply systems, aids for the blind, children's toys, and even robot manufacturing and future voice-control systems all depend on it. As a technology that closely combines theory and practice, TTS has drawn considerable attention from scholars in many disciplines since text-to-speech conversion technology first emerged. Improving the fluency, naturalness, and accuracy of synthesized speech has become the focus of that attention. Within this effort, accurately and automatically annotating the pronunciation of Chinese polyphonic characters (polyphones) has become one of the difficulties of text-to-speech conversion in TTS systems.

This thesis takes the 921 polyphones in the "Modern Chinese Dictionary" (5th Edition) (hereinafter "Modern Chinese") as its object of study. It measures the usage frequency of each polyphone and of each of its pronunciation items in the CCL Modern Chinese corpus and, on that basis and from the perspective of linguistic theory, proposes a new approach to the polyphone problem in TTS systems. The thesis has three parts.

In the first part, a count of the polyphonic characters in "Modern Chinese" yields 921 polyphones as the object of study. For each polyphone, its parts of speech and the number of polyphonic words were tallied.

In the second part, the occurrence frequency and usage frequency of these 921 polyphones were counted in the CCL Modern Chinese corpus. Based on cumulative frequency, the polyphones were divided into a high-frequency level and a low-frequency level. The usage frequency of each polyphone's pronunciation items was then tallied, and the items were divided into three levels: regular pronunciation, secondary pronunciation, and rare pronunciation. Low-frequency polyphones, which account for only 1% of the corpus, are always assigned a default pronunciation.

In the third part, the high-frequency polyphones were classified and handled with a combination of methods, including elimination by multi-syllable words, determination by part of speech, and a lexicon of common polyphonic words. For polyphones whose pronunciation items are all in frequent use and cannot be told apart by part of speech, rules were built with statistical methods, separately for each type.
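
The frequency tiering and the layered disambiguation described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the thesis's implementation: the counts, the word lexicon, the part-of-speech rules and tags, the 0.99 coverage threshold, and the 0.05 rare-reading cutoff are all invented for the example, and every function name is hypothetical. The last step simply falls back to the most frequent reading as a stand-in for the statistically built rules, whose details the abstract does not give.

```python
# All counts, lexicon entries, POS tags and thresholds below are invented for
# illustration; none of them come from the thesis or from the CCL corpus.
READING_COUNTS = {
    "长": {"chang2": 700, "zhang3": 300},
    "行": {"xing2": 800, "hang2": 200},
    "和": {"he2": 940, "huo4": 55, "hu2": 5},
    "柏": {"bai3": 9, "bo2": 1},
}
WORD_LEXICON = {("银行", "行"): "hang2", ("校长", "长"): "zhang3"}
POS_RULES = {("长", "v"): "zhang3", ("长", "a"): "chang2"}


def split_by_cumulative_frequency(counts, coverage=0.99):
    """Return (high, low). Polyphones are taken in descending order of total
    count until they cover `coverage` of all occurrences; the rest are low."""
    totals = {ch: sum(readings.values()) for ch, readings in counts.items()}
    grand_total = sum(totals.values())
    high, low, running = set(), set(), 0
    for ch in sorted(totals, key=totals.get, reverse=True):
        (high if running < coverage * grand_total else low).add(ch)
        running += totals[ch]
    return high, low


def tier_readings(readings, rare_cutoff=0.05):
    """Label the most frequent reading 'regular'; label each other reading
    'secondary' if its share reaches `rare_cutoff`, otherwise 'rare'."""
    total = sum(readings.values())
    ranked = sorted(readings, key=readings.get, reverse=True)
    tiers = {ranked[0]: "regular"}
    for reading in ranked[1:]:
        share = readings[reading] / total
        tiers[reading] = "secondary" if share >= rare_cutoff else "rare"
    return tiers


def disambiguate(char, word, pos, counts, high):
    """Pick a reading for `char` occurring in `word` with POS tag `pos`."""
    default = max(counts[char], key=counts[char].get)
    if char not in high:                # low-frequency: always the default
        return default
    if (word, char) in WORD_LEXICON:    # a multi-syllable word fixes the reading
        return WORD_LEXICON[(word, char)]
    if (char, pos) in POS_RULES:        # the part of speech fixes the reading
        return POS_RULES[(char, pos)]
    return default                      # stand-in for the statistical rules


if __name__ == "__main__":
    high, low = split_by_cumulative_frequency(READING_COUNTS)
    print(sorted(high), sorted(low))
    print(tier_readings(READING_COUNTS["和"]))
    print(disambiguate("行", "银行", "n", READING_COUNTS, high))
    print(disambiguate("长", "长", "v", READING_COUNTS, high))
```

Ordering the checks from the most specific evidence (a known multi-syllable word) to the least specific (a default reading) means that a cheap lookup settles most cases, and only the remaining ambiguous ones need heavier rules.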

Keywords/Search Tags:

Polyphone, Text-to-Speech, Corpus, Word frequency, Pronunciation item

Related items

1	Analysis On The Output Of Polyphone In Spontaneous Speech By High-level L2 Learners
2	A Quantitative Study On Part-of-speech Frequency Of Polyfunctional Words Based On British National Corpus
3	Study On The Mastering Of The Relationship Between Pronunciation And Meaning Of Polyphone Of Middle And High Level Chinese Learners
4	Research On The Attributes Of High Frequency Modern Chinese Morpheme Items
5	Corpus-based Study Of Low-frequency Words In Contemporary English
6	A Study On The Teacher's Word Book Of 30,000 Words And General Service List
7	The Analysis Of Party Document Word Frequency Based On The Corpus In 2012
8	The Development Of The Hot Issues News Corpus And Word Study
9	A Study On The Phonetic Notes Of QunJingYinBian
10	Research And Implementation Of Automatic Labeling System For Quasi Written Language Korean Speech Corpus