Phonemes Associated Multilingual Speech Fusion

Posted on:2014-11-27

Degree:Master

Type:Thesis

Country:China

Candidate:G W Sun

Full Text:PDF

GTID:2268330401490546

Subject:Computer Science and Technology

Abstract/Summary:

As international exchange grows closer, a monolingual speech environment can no longer meet demand, and integrating a multilingual speech environment on a single smart device has become a trend. Embedded devices have limited storage capacity, yet multilingual voice data occupies a very large amount of storage. Primitive voice data shares many correlated features, both within one language and across different languages, and single-language coding methods cannot eliminate this kind of redundancy. Exploring the structural characteristics of voice data across languages and optimizing the storage form of multilingual speech data therefore has important practical significance.

Multilingual processing is an active research area that includes multilingual machine translation, multilingual speech recognition and synthesis, and speaker language adaptation. In multilingual processing, the phoneme is often used as the primitive that links different languages. In linguistics the number of phonemes is limited and phonemes are clearly distinguished from one another, and results from language research such as prosodic rules, spelling rules, and phoneme combination rules can be used as a priori knowledge in speech processing.

The heterogeneous text-voice fusion research in this thesis targets an intelligent tutoring system for multilingual text writing. The system is mainly used for online or offline tutoring of beginners such as school-age children, and covers standard writing guidance, writing assessment, word pronunciation, and writing evaluation feedback. Integrated voice guidance is an essential part of the system.

Because the system involves a large amount of voice data, including guidance speech and character pronunciations, the conflict between the available storage space and the volume of voice data became an important problem to solve. A multilingual speech data fusion coding method is therefore proposed, based on the correlation of phoneme data among heterogeneous languages and among different words of the same language. Voice sample sequences for the same phoneme segment in different languages are cut out according to segment templates, a wavelet transform is applied to those sequences, and feature vectors are extracted to generate shared template sets. The speech data of any word or sentence is then encoded or decoded against the template sets. The speech record database, made up of the phoneme template sets, is indexed by a (syllable, phoneme) structure. The single-word compression ratio, speech data size, segmental signal-to-noise ratio (SNRS), and mean opinion score (MOS) from subjective evaluation are reported to be significantly better than those of existing methods, and the restored voice is of good quality.

The innovations of this thesis are as follows.

First, drawing on a large amount of linguistic data and taking an interdisciplinary perspective, linguistic research is brought into speech coding to explore a new direction for voice compression, and a fusion mechanism based on phoneme data correlation within multilingual speech is proposed.

Second, a two-level (syllable, phoneme) retrieval structure is established, which optimizes record storage in the speech database and dramatically reduces the storage the database requires.

Third, an effective and objective automatic segmentation mechanism for syllables and phonemes is designed. After preprocessing of the voice data, it achieves large-scale automatic segmentation of speech into syllables and phonemes.

Experimental results show that the scheme optimizes the record storage structure of the speech database, effectively compresses the stored voice data, and gives satisfactory voice reconstruction results.
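
The pipeline described in the abstract (wavelet transform of phoneme segments, feature vectors pooled into a shared template set, and a (syllable, phoneme) index used for coding and decoding) can be illustrated with a small sketch. This is not the thesis's implementation. The Haar wavelet, the two decomposition levels, the RMS matching rule, the `tolerance` value, the `PhonemeStore` class, the `zh:`/`en:` syllable labels, and the sample values are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """One Haar analysis step; returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd-length input
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    return ([(a + b) / SQRT2 for a, b in pairs],
            [(a - b) / SQRT2 for a, b in pairs])

def features(segment, levels=2):
    """Coarse approximation coefficients after `levels` Haar steps (details dropped)."""
    approx = list(segment)
    for _ in range(levels):
        approx, _detail = haar_step(approx)
    return approx

def reconstruct(approx, levels=2):
    """Inverse Haar steps with all detail coefficients assumed to be zero."""
    signal = list(approx)
    for _ in range(levels):
        signal = [a / SQRT2 for a in signal for _rep in range(2)]
    return signal

def rms_distance(u, v):
    """Root-mean-square difference between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)) / len(u))

class PhonemeStore:
    """Shared template set plus a (syllable, phoneme) -> template id index."""

    def __init__(self, tolerance=0.05):
        self.tolerance = tolerance  # hypothetical matching threshold
        self.templates = []         # shared feature vectors
        self.index = {}             # (syllable, phoneme) -> template id

    def register(self, syllable, phoneme, samples):
        """Reuse a close-enough template if one exists, else store a new one."""
        vec = features(samples)
        for tid, tpl in enumerate(self.templates):
            if len(tpl) == len(vec) and rms_distance(tpl, vec) <= self.tolerance:
                break
        else:
            self.templates.append(vec)
            tid = len(self.templates) - 1
        self.index[(syllable, phoneme)] = tid
        return tid

    def encode(self, keys):
        """A word becomes a list of template ids, one per (syllable, phoneme) key."""
        return [self.index[k] for k in keys]

    def decode(self, ids):
        """Concatenate the approximate waveforms rebuilt from each template."""
        out = []
        for tid in ids:
            out.extend(reconstruct(self.templates[tid]))
        return out

# Made-up sample segments: the /m/ in two languages is nearly identical.
store = PhonemeStore()
store.register("zh:ma", "m", [0.0, 0.2, 0.4, 0.2, 0.0, -0.2, -0.4, -0.2])
store.register("zh:ma", "a", [0.9, 0.7, 0.5, 0.3, 0.1, -0.1, -0.3, -0.5])
store.register("en:me", "m", [0.0, 0.21, 0.4, 0.19, 0.0, -0.2, -0.41, -0.2])

code = store.encode([("zh:ma", "m"), ("zh:ma", "a")])
audio = store.decode(code)
```

In this toy data the two /m/ segments fall within the tolerance, so both index entries point at one stored template and only two templates are kept for three registered phonemes. Dropping the detail coefficients makes the sketch lossy; a practical codec would keep or quantize more coefficients to trade storage against reconstruction quality.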

Keywords/Search Tags:

speech, phoneme, correlation, multilingual, fusion coding

Related items

1	Research On Key Technologies For Building Multilingual TTS Text Corpus
2	Research On Speech Phoneme Recognition Based On Deep Learning
3	Independent Component Of The Chinese-based Phoneme Spectrum Analysis And Comparison Of Speech Synthesis Research
4	The Research Of Speech Coding Identification Technique In Communications System
5	Research On Multilingual Text Recognition In Complex Scenes And System Design
6	Improved phoneme-based myoelectric speech recognition
7	A Study On Low-resource Multilingual Speech Recognition Based On Transfer Learning
8	Information processing and semantic correlation in multimodal, multilingual, human computer interfaces
9	End-to-End Speech Synthesis Based On Multi-Language Modeling
10	Research And Implementation Of Speech Intelligibility Evaluation Method Based On Phoneme