
The Research On Vocal Tract Spectrum And Pitch Frequency Transformation In Voice Conversion

Posted on: 2014-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: W C Jie
Full Text: PDF
GTID: 2248330395983802
Subject: Signal and Information Processing
Abstract/Summary:
Voice conversion is a technique for changing the personality characteristics of a source speaker's voice into those of a target speaker while preserving the original semantic information. This thesis studies the vocal tract spectrum parameters and the pitch frequency in a voice conversion system. The main work is as follows.

First, the transformation of pitch frequency is studied. Conventional methods usually convert the pitch frequency and the spectrum parameters separately, which causes the converted pitch frequency to lose part of the speaker's personality characteristics. A joint conversion method for pitch frequency and vocal tract spectral envelope, based on a radial basis function (RBF) network, is therefore proposed. The method establishes a relationship between the pitch frequency and the spectrum parameters, so the converted pitch frequency tracks the target more closely and contains more of the target's detail.

Second, the conversion of the vocal tract spectrum is studied. The Gaussian Mixture Model (GMM) is known to over-smooth the converted spectral envelope, while the codebook method makes it discontinuous. To obtain a better spectral envelope, a vocal tract spectrum parameter conversion method based on a codebook-improved GMM is proposed. It combines the two methods, keeping the advantages of each while overcoming their shortcomings, and brings the converted spectral envelope closer to the target.

Third, the conversion of the vocal tract spectrum is studied further. A conventional system usually has only one conversion rule, and a single rule cannot describe the mapping function accurately. Moreover, the EM algorithm often leaves the model parameters trapped in a local optimum. To overcome these shortcomings, a voice conversion method based on self-organization clustering and modified Particle Swarm Optimization (PSO) is proposed. The method builds a conversion system with multiple rules through self-organization clustering and determines the GMM parameters in each cluster with the modified PSO. This makes the parameter mapping more accurate and makes the synthesized speech more similar to the target speaker.
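The abstract does not say how the RBF network is set up or trained, so the following is only a minimal sketch of the general idea of joint conversion: a single Gaussian RBF network maps a source frame [spectrum, log F0] to a target frame [spectrum, log F0], so that pitch and spectrum are converted together instead of separately. The function names (`rbf_design`, `fit_rbf`, `predict_rbf`), the randomly chosen centers, the fixed width, the least-squares output weights, and the synthetic data are all assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np

def rbf_design(X, centers, width):
    """Gaussian activation of every frame against every center, plus a bias column."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([phi, np.ones((X.shape[0], 1))])

def fit_rbf(X, Y, n_centers=20, width=1.0, seed=0):
    """Pick centers from the training frames (assumption) and solve the
    output weights by linear least squares."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=n_centers, replace=False)
    centers = X[idx]
    Phi = rbf_design(X, centers, width)
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return centers, W

def predict_rbf(X, centers, W, width=1.0):
    return rbf_design(X, centers, width) @ W

# Synthetic stand-in for time-aligned source/target frames (not real speech).
rng = np.random.default_rng(1)
n_frames, n_spec = 200, 4
src = rng.normal(size=(n_frames, n_spec + 1))   # columns: spectrum..., log F0
A = rng.normal(size=(n_spec + 1, n_spec + 1))
tgt = np.tanh(src @ A)                          # hypothetical target frames

centers, W = fit_rbf(src, tgt, n_centers=40, width=2.0)
converted = predict_rbf(src, centers, W, width=2.0)

# Split the joint output back into its two parts.
converted_spec = converted[:, :n_spec]
converted_logf0 = converted[:, n_spec]
```

Because the source pitch and spectrum enter the same network and the target pitch and spectrum are predicted from the same hidden layer, the converted pitch can depend on the spectral features. This is one plausible reading of "establishing a relationship" between the two; the thesis itself may use a different structure.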
Keywords/Search Tags:Voice Conversion, Vocal Tract Spectrum Conversion, Pitch Frequency Transformation, Radial Basis Function Network, Codebook, Gaussian Mixture Model, Self Organization Clustering, Particle Swarm Optimization