
Mongolian Speech Conversion System Based On RBF-GMM

Posted on: 2022-04-22; Degree: Master; Type: Thesis
Country: China; Candidate: M K Hu; Full Text: PDF
GTID: 2518306509954629; Subject: Computer technology
Abstract/Summary:
Mongolian speech conversion is an important branch of Mongolian information processing. With the development of personalized speech synthesis and increasingly diverse forms of human-computer interaction, voice conversion has attracted growing attention from the academic community. Voice conversion is a special form of speech synthesis that transforms the voice of a source speaker into the voice of a target speaker while leaving the speech content unchanged. It can be applied at the back end of a speech synthesis system to produce a variety of personalized synthesis effects. In recent years, voice conversion for mainstream languages such as Chinese and English has made great progress, but little has been achieved for Mongolian. This thesis studies voice conversion technology and implements a Mongolian speech conversion system based on a neural network. First, a Mongolian speech conversion experiment is carried out with the traditional GMM model on a standard Mongolian corpus. Analysis of the experimental results shows that the GMM model over-smooths the features it converts, so the converted Mongolian speech sounds muffled and dull. To address this problem, a multi-layer RBF neural network is used to improve the feature vectors, so that the converted speech features are no longer over-smoothed and the quality of the converted speech improves. Finally, a Mongolian speech conversion system is implemented on the basis of this improved method. The system performs voice conversion between speakers and provides functions such as volume adjustment, tentative imitation, and downloading of the target voice.
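The abstract does not give the conversion function itself. As a rough illustration only, the sketch below assumes the commonly used joint-density GMM mapping, reduced to a single scalar feature, in which each converted value is a posterior-weighted average of per-component linear regressions. The names `gmm_convert` and `COMPONENTS`, and all parameter values, are hypothetical and are not taken from the thesis. The averaging step is what tends to pull converted features toward the component means, which is one way to picture the over-smoothing described above.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_convert(x, components):
    """Map a scalar source feature x to a target feature.

    Assumed form (joint-density GMM mapping, 1-D case):
        F(x) = sum_m p(m | x) * (mu_y_m + cov_yx_m / var_x_m * (x - mu_x_m))
    """
    # Weighted likelihood of x under each component's source marginal.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)  # not guarded against underflow for extreme x
    posteriors = [lik / total for lik in likelihoods]

    # Posterior-weighted average of the per-component linear regressions.
    return sum(
        p * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
        for p, c in zip(posteriors, components)
    )


# Hypothetical two-component model; values chosen only for illustration.
COMPONENTS = [
    {"weight": 0.5, "mu_x": -1.0, "mu_y": -2.0, "var_x": 1.0, "cov_yx": 0.5},
    {"weight": 0.5, "mu_x": 1.0, "mu_y": 2.0, "var_x": 1.0, "cov_yx": 0.5},
]

if __name__ == "__main__":
    for source_value in (-1.0, 0.0, 1.0):
        print(source_value, gmm_convert(source_value, COMPONENTS))
```

In a real system the features would be multi-dimensional spectral vectors and the parameters would be estimated from time-aligned source and target frames; the scalar case is used here only to keep the arithmetic visible.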
Keywords/Search Tags: Mongolian, Voice conversion, GMM, RBF neural network, Deep learning