Research On Voice Conversion From Tibetan Amdo To U-tsang Dialect Based On Deep Learning

Posted on: 2021-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: X T Xing
Full Text: PDF
GTID: 2415330623982076
Subject: Intelligent information processing
Abstract/Summary:
Pronunciation differs greatly among Tibetan dialects, which makes face-to-face communication difficult for speakers of different dialects. In recent years, great progress has been made in voice conversion for Chinese and English, but Tibetan voice conversion technology is still at an early stage. The only existing implementation of Tibetan voice conversion is based on the five-degree tone model; it merely modifies the pitch curve directly with a parametric method, and the converted speech is of poor quality. This thesis uses a deep neural network (DNN) to convert speech from the Amdo dialect to the U-tsang dialect, using a parallel corpus and a non-parallel corpus respectively.

The main research work and innovations are as follows. Firstly, the linguistic differences between the two dialects are analyzed in order to design the parallel and non-parallel corpora. Secondly, voice conversion from the Amdo dialect to the U-tsang dialect is realized with the parallel corpus. In the training stage, acoustic parameters are extracted and used to train a DNN conversion model. In the conversion stage, the model converts the acoustic parameters of Amdo speech into those of U-tsang speech, and the U-tsang speech is then synthesized with a vocoder. Thirdly, voice conversion from the Amdo dialect to the U-tsang dialect is realized with the non-parallel corpus. A pronunciation mapping table is designed according to the pronunciation differences between the two dialects, and both the pronunciation dictionary used in the recognition stage and the context-dependent labels used in the synthesis stage are built from this table. In this method, DNNs serve as the network models for speech recognition of the Amdo dialect and for speech synthesis of the U-tsang dialect. Finally, the naturalness, intelligibility and similarity of the converted sentences are evaluated.

The experimental results show that the non-parallel corpus method performs better than the parallel corpus method.
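As an illustration of the non-parallel method described in the abstract, the following minimal Python sketch shows how a pronunciation mapping table might connect the recognition stage to the synthesis stage. It is not the thesis's implementation. The unit symbols (`A1`, `U1`, and so on), the table contents, the helper names `map_units` and `context_labels`, and the simplified `left-current+right` label format are all assumptions made for illustration; the abstract does not give the actual table or label format.

```python
from typing import Dict, List

# Hypothetical mapping table. The symbols are placeholders, NOT real
# Amdo/U-tsang phone correspondences, which the abstract does not list.
AMDO_TO_UTSANG: Dict[str, List[str]] = {
    "A1": ["U1"],        # one-to-one mapping
    "A2": ["U2", "U3"],  # assumed: one source unit may map to several target units
    "A3": [],            # assumed: a source unit may have no target counterpart
}


def map_units(amdo_units: List[str], table: Dict[str, List[str]]) -> List[str]:
    """Map a recognized Amdo unit sequence to a U-tsang unit sequence."""
    target: List[str] = []
    for unit in amdo_units:
        if unit not in table:
            raise KeyError(f"no mapping for unit {unit!r}")
        target.extend(table[unit])
    return target


def context_labels(units: List[str], boundary: str = "sil") -> List[str]:
    """Build simplified context-dependent labels of the form left-current+right."""
    padded = [boundary] + list(units) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


# Pretend this sequence came out of the Amdo speech recognizer.
recognized = ["A1", "A2", "A3", "A1"]
target_units = map_units(recognized, AMDO_TO_UTSANG)
labels = context_labels(target_units)
print(target_units)
print(labels)
```

In a full system, the labels produced this way would be passed to the U-tsang synthesis model. Real context-dependent labels usually carry much more information (tone, syllable position, phrase boundaries) than the three-unit window used here.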
Keywords/Search Tags: Amdo dialect, U-tsang dialect, Voice conversion, Parallel corpus, Non-parallel corpus