Text Analysis Of Speech Synthesis Based On Statistical Parameters Of Tibetan Language In Specific Fields

Posted on:2021-05-11

Degree:Master

Type:Thesis

Country:China

Candidate:L L Wang

GTID:2415330623482073

Subject:Circuits and Systems

Abstract/Summary:

China is a country composed of 56 ethnic groups, with a Tibetan population of 385800. However, research on the Tibetan language started relatively late, and text analysis for Tibetan speech synthesis based on statistical parameters is particularly lacking. Research on text analysis for Tibetan speech synthesis in specific fields is also lacking. Therefore, this thesis improves the word and part-of-speech layer and the sentence classification layer of the existing text analysis for Tibetan speech synthesis based on statistical parameters in specific fields, in particular by adding text classification information at the sentence layer. We then use this text analysis to carry out Tibetan speech synthesis and obtain a better synthesis effect. The related work and innovations of this thesis are as follows.

Firstly, we propose three Tibetan word segmentation models to obtain word boundaries: a bi-directional long short-term memory model with a conditional random field (BiLSTM_CRF), a convolutional neural network with bi-directional long short-term memory and a conditional random field (CNN_BiLSTM_CRF), and a sequence-to-sequence model (Seq2seq). The experimental results show that the CNN_BiLSTM_CRF model is more effective in Tibetan word segmentation, the BiLSTM_CRF model is more accurate in Tibetan part-of-speech tagging, and the Seq2seq model can carry out Tibetan word segmentation and part-of-speech tagging at the same time.

Secondly, this thesis proposes a Tibetan text classification method based on deep learning. First, we segment the Tibetan classification texts to obtain Tibetan words. Second, we construct a word vector space model to get word vectors by removing stop words, calculating word frequency and extracting feature words. Third, the word vectors are passed to the classification model to train the Tibetan text classifier. Finally, we use the Tibetan text classifier to classify Tibetan texts. We also compare the experimental results with those of traditional methods. The results show that the fastText classifier, among the deep neural network models, has the best effect on text classification.

Thirdly, we propose a text analysis method for Tibetan speech synthesis based on statistical parameters in specific fields. First, we decompose the Tibetan characters in the text to be synthesized to get their sound and vowel information and complete the character conversion. Second, we segment the text into individual syllables. Third, we segment the Tibetan text into words and tag the parts of speech to get each word and its corresponding part of speech. Fourth, we classify the text and mark each sentence with its class. Finally, a context-sensitive labeling program produces four-layer context-sensitive labels based on the vowel information, syllable information, word and part-of-speech information and sentence classification information obtained in the above steps. These labels and the question set designed for domain-specific Tibetan speech synthesis are input into a deep neural network (DNN) model for domain-specific Tibetan speech synthesis. The experimental results show that the naturalness of the synthesized speech improves after adding word segmentation and part-of-speech tagging, and that its expressiveness improves after adding text classification.
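The vector space construction described for the text classification step (removing stop words, calculating word frequency, extracting feature words) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the placeholder tokens standing in for segmented Tibetan words, the stop-word set and the choice of raw frequency as the feature-selection criterion are all assumptions made for illustration.

```python
from collections import Counter


def build_vocabulary(docs, stop_words, top_k):
    """Select the top_k most frequent non-stop-word tokens as feature words.

    Frequency-based selection is an assumption; the thesis may use
    a different criterion for extracting feature words.
    """
    counts = Counter(
        tok for doc in docs for tok in doc if tok not in stop_words
    )
    # Sort by descending frequency, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tok for tok, _ in ranked[:top_k]]


def vectorize(doc, vocabulary):
    """Map one segmented document to a term-frequency vector over the feature words."""
    counts = Counter(doc)
    return [counts[word] for word in vocabulary]


# Hypothetical data: placeholder tokens standing in for segmented Tibetan words.
docs = [
    ["w_school", "w_the", "w_teacher", "w_school"],
    ["w_doctor", "w_the", "w_hospital", "w_doctor"],
]
stop_words = {"w_the"}  # hypothetical stop-word list

vocabulary = build_vocabulary(docs, stop_words, top_k=3)
vectors = [vectorize(doc, vocabulary) for doc in docs]
```

Each resulting vector has one entry per feature word and could then be passed to a classifier for training. The documents are assumed to be already segmented into words, which corresponds to the first step of the classification method.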

Keywords/Search Tags:

Tibetan text analysis, Tibetan text classification, Tibetan word segmentation, part of speech tagging of Tibetan, Tibetan speech synthesis, deep neural network

Related items

1	Tibetan Segmentation And POS Tagging Study
2	Research And Implementation Of The Tibetan Part Of Speech Tagging System
3	Research On Word Segmentation And Part-of-speech Of Tibetan On Neural Network
4	Research On Neural Network Based Tibetan Speech Synthesis Technique
5	Research On Sentiment Classification Technology Of Tibetan Text
6	Tibetan Lhasa Acoustic Model Based On LSTM-CTC Speech Recognition System
7	Research On Tibetan Speech Recognition Based On Deep Convolutional Neural Network
8	Research On Automatic Notation Of Word For Tibetan Corpus Based On HMM
9	Research On Automatic Notation Of Word For Tibetan Corpus Based On Hmm
10	Research And Implementation Of Sequence To Sequence Tibetan Lhasa Dialect Speech Synthesis