Research On Neural Network Based Tibetan Speech Synthesis Technique

Posted on:2020-10-27

Degree:Master

Type:Thesis

Country:China

Candidate:G C Du

GTID:2415330578964433

Subject:Computer application technology

Abstract/Summary:

Speech synthesis is one of the core technologies in human-computer interaction research and a cutting-edge technology in the field of information processing. It aims to transform a text sequence into clear, natural, and fluent speech in real time. Research on it has important theoretical significance and practical value for the development of human-machine voice communication, intelligent robots, and automatic voice broadcasting. With the rapid development of computer and multimedia technology, speech synthesis has attracted considerable attention. In recent years in particular, the successful application of neural networks to machine translation, text categorization, question answering, information extraction, and speech recognition has made neural network-based speech synthesis a research hotspot worldwide.

Tibetan speech synthesis is one of the important research tasks of Tibetan information processing. Compared with Chinese and English, however, research on Tibetan speech synthesis is still at a developing stage. At present, Tibetan speech synthesis systems are mainly implemented with waveform splicing (concatenation) and with HMM-based statistical parametric speech synthesis. Waveform splicing requires large storage capacity and a long system construction period, and the prosody of speech produced by statistical parametric synthesis is not satisfactory. This thesis therefore presents a neural network-based Tibetan speech synthesis technique, built on an analysis of the structural characteristics and spelling rules of Tibetan and using a Seq2Seq model with an attention mechanism.

The thesis studies Tibetan speech synthesis from the following three aspects:

(1) Front end of the speech synthesis system: the structure and spelling rules of Tibetan characters are analyzed based on the traditional Tibetan language method, and a Tibetan component decomposition algorithm is presented. In addition, an attention-based Seq2Seq model is used to predict the prosody of Tibetan text.

(2) Back end of the speech synthesis system: an acoustic model for Tibetan speech synthesis is designed based on the Seq2Seq model, with emphasis on the encoder and decoder for Tibetan speech synthesis. The Tibetan speech waveform is then generated with the Griffin-Lim algorithm.

(3) Evaluation: the performance of a corpus-based Tibetan speech synthesis system is compared with that of the neural network-based system to verify the effectiveness of the proposed method. The experiments indicate that the neural network-based Tibetan speech synthesis system achieves better performance when a large-scale corpus is available.
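
The abstract mentions a Tibetan component decomposition algorithm but does not describe it. As a rough illustration of what such a front-end step might involve, the sketch below splits Tibetan text into syllables at the tsheg (U+0F0B) and shad (U+0F0D) marks, then groups each syllable's Unicode code points into consonant stacks. This is not the thesis's algorithm. The function names (`classify`, `split_syllables`, `decompose`) and the three-way labeling are assumptions made for illustration, and the sketch does not assign grammatical roles such as prefix, root, or suffix.

```python
import re

TSHEG = "\u0f0b"  # Tibetan syllable delimiter
SHAD = "\u0f0d"   # Tibetan clause/sentence delimiter


def classify(ch):
    """Label one code point by its range in the Unicode Tibetan block."""
    cp = ord(ch)
    if 0x0F40 <= cp <= 0x0F6C:
        return "base"        # full-form consonant letter
    if 0x0F90 <= cp <= 0x0FBC:
        return "subjoined"   # consonant written below a base letter
    if 0x0F71 <= cp <= 0x0F7D or cp in (0x0F80, 0x0F81):
        return "vowel"       # dependent vowel sign
    return "other"


def split_syllables(text):
    """Split text into syllables at tsheg, shad, or whitespace."""
    parts = re.split(f"[{TSHEG}{SHAD}\\s]+", text)
    return [p for p in parts if p]


def decompose(syllable):
    """Group a syllable into stacks; each base letter opens a new stack."""
    stacks = []
    for ch in syllable:
        kind = classify(ch)
        if kind == "base" or not stacks:
            stacks.append([])
        stacks[-1].append((ch, kind))
    return stacks


if __name__ == "__main__":
    # Two syllables separated by a tsheg and closed by a shad.
    sample = "\u0f56\u0f7c\u0f51\u0f0b\u0f66\u0f90\u0f51\u0f0d"
    for syl in split_syllables(sample):
        print(syl, [[kind for _, kind in stack] for stack in decompose(syl)])
```

A real front end would additionally need the spelling rules that decide which stack is the root letter and which letters are prefixes or suffixes. Those rules are what the thesis analyzes, and they are deliberately left out of this sketch.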

Keywords/Search Tags:

Tibetan Speech Synthesis, Word Embedding, Prosody Prediction, Neural Networks, Attention Mechanism

Related items

1	Text Analysis Of Speech Synthesis Based On Statistical Parameters Of Tibetan Language In Specific Fields
2	Classical Chinese Poetry Generation Research Based On Neural Networks
3	Research On Tibetan Word Segmentation And Part-of-speech Tagging Based On GNN
4	Research And Implementation Of Sequence To Sequence Tibetan Lhasa Dialect Speech Synthesis
5	Research On Chinese Punctuation Prediction In Plain Text Scenarios
6	Research On Generalized Model Of Chinese Couplet Based On Recurrent Neural Networks
7	Cognitive Mechanism Research Of Speech Emotional Prosody In Lhasa Tibetan
8	Research On The Mongolian Speech Synthesis Based On Prosody
9	Research On Word Segmentation And Part-of-speech Of Tibetan On Neural Network
10	Research On Movie Recommendation Algorithm Based On Convolutional Block Attention Module-Convolutional Neural Networks Model