
Phonemes Associated Multilingual Speech Fusion

Posted on: 2014-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: G W Sun
Full Text: PDF
GTID: 2268330401490546
Subject: Computer Science and Technology
Abstract/Summary:
With increasingly close international exchange, a monolingual speech environment can no longer meet demand, and integrating a multilingual speech environment on a smart device has become a trend. Embedded devices have limited storage capacity, yet multilingual voice data occupies a very large amount of storage space. Voice primitive data shows a wide range of correlated features, both within a single language and across different languages. Single-language coding methods cannot eliminate this type of data redundancy, so exploring the structural characteristics of voice data across heterogeneous languages and optimizing the storage form of multilingual speech data has important practical significance.

Multilingual processing technology is a research focus, covering areas such as multilingual machine translation, multilingual speech recognition and synthesis, and speaker language adaptation. In multilingual processing, the phoneme is often used as the primitive for establishing links between different languages. The number of phonemes in linguistics is limited, phonemes are clearly distinguished from one another, and the rhythm rules, spelling rules, phoneme combination rules, and other research results on languages can be used as a priori knowledge in speech processing.

The heterogeneous text-voice fusion research in this thesis is intended for a multi-language text writing intelligent tutoring system, which is mainly used for online or offline tutoring of beginners such as school-age children. The system covers standard writing guidance, writing assessment and guidance, word pronunciation, and writing evaluation feedback, and integrated voice guidance cues are an essential part of it.
Because the system involves a huge amount of voice data, including guidance voice, character pronunciations, and so on, the conflict between the available storage space and the volume of voice data became an important problem to be solved. A multilingual speech data fusion coding method is therefore proposed, based on the correlation of phoneme data that exists among heterogeneous languages and among different words in the same language. Voice sample sequences of the same phoneme data segment in different languages were intercepted according to segment templates, a wavelet transform was applied to those sequences, and feature vectors were then extracted to generate shared template sets. The speech data of any word or sentence was coded and decoded according to the template sets. The speech record database, made up of template phoneme sets, was indexed according to a (syllable, phoneme) structure. The single-word compression ratio, speech data size, segmental signal-to-noise ratio (SNRS), and subjective evaluation score (MOS) were significantly better than those of existing methods, and the restored voice was of good quality.

The innovations of this thesis are:

First, after studying a large amount of linguistic data, it brings linguistics research into the area of speech coding from an interdisciplinary perspective, exploring a new direction for voice compression coding and proposing a fusion mechanism based on the correlation of phoneme data within multilingual voice.

Second, it establishes a two-level (syllable, phoneme) retrieval structure and optimizes record storage in the speech database, dramatically reducing the storage capacity the database requires.

Third, it designs an effective and objective automatic segmentation mechanism for syllables and phonemes, which achieves large-scale automatic segmentation of speech into syllables and phonemes after the voice data has been preprocessed.

Experimental results show that the scheme optimizes the record storage structure of the speech database, effectively compresses the data stored in the voice library, and yields satisfactory voice reconstruction results.
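
The segmental signal-to-noise ratio mentioned above as an evaluation metric can be illustrated with a short sketch. This is a common textbook formulation, not necessarily the exact definition used in the thesis: the function name `segmental_snr`, the frame length of 160 samples, and the clamping range of -10 dB to 35 dB are all assumptions made for illustration.

```python
import math


def segmental_snr(reference, decoded, frame_len=160,
                  floor_db=-10.0, ceil_db=35.0):
    """Mean per-frame SNR in dB between a reference and a decoded signal.

    frame_len, floor_db and ceil_db are illustrative defaults (assumptions),
    not values taken from the thesis.
    """
    if len(reference) != len(decoded):
        raise ValueError("signals must have the same length")

    frame_snrs = []
    # Walk over complete, non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(r * r for r in ref)
        noise_energy = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if signal_energy == 0.0:
            continue  # skip silent frames, which carry no speech information
        if noise_energy == 0.0:
            snr_db = ceil_db  # perfect reconstruction of this frame
        else:
            snr_db = 10.0 * math.log10(signal_energy / noise_energy)
        # Clamp so that a few extreme frames do not dominate the average.
        frame_snrs.append(min(max(snr_db, floor_db), ceil_db))

    if not frame_snrs:
        raise ValueError("no non-silent full frames to evaluate")
    return sum(frame_snrs) / len(frame_snrs)


# Usage: a sine wave and a copy attenuated by 10 percent.
reference = [math.sin(2 * math.pi * 5 * n / 320) for n in range(320)]
decoded = [0.9 * x for x in reference]
print(round(segmental_snr(reference, decoded), 2))  # about 20 dB
```

Averaging per-frame values in decibels, rather than computing a single ratio over the whole recording, keeps loud segments from masking errors in quiet ones, which is why frame-based measures are often preferred for judging reconstructed speech.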
Keywords/Search Tags: speech, phoneme, correlation, multilingual, fusion coding