The Research On Voice Conversion Algorithm Based On Improved Bilinear Frequency Warping For Parallel Or Nonparallel Corpora

Posted on:2018-03-05

Degree:Master

Type:Thesis

Country:China

Candidate:Z L Lv

GTID:2348330536479570

Subject:Signal and Information Processing

Abstract/Summary:

Voice conversion is a speech processing technique that transforms the voice characteristics of one speaker (the source speaker) so that listeners perceive the speech as having been uttered by another specific speaker (the target speaker), without altering the linguistic message. Although speech carries abundant information, including semantic, speaker-individuality, language, and emotional information, voice conversion mainly focuses on spectral characteristics and prosodic features. Many applications of voice conversion, such as entertainment and cross-lingual conversion, require high-quality converted speech and the ability to convert between non-parallel corpora. Current voice conversion systems face two main problems. On the one hand, the converted voice cannot achieve high similarity and good sound quality at the same time: existing spectral conversion methods trade off the similarity of the conversion against the quality of the converted speech. On the other hand, training the conversion function depends on a parallel corpus, which limits the versatility of the voice conversion system.

First, to achieve higher speech quality and similarity, this thesis proposes a bilinear frequency warping plus amplitude scaling algorithm based on adaptive Gaussian classification. Adaptive Gaussian classification models the acoustic feature distribution of speech more accurately, so that conversion is performed on the basis of a more reasonable classification. The improved method is assessed by both objective and subjective evaluation. Compared with the bilinear frequency warping plus amplitude scaling algorithm with fixed classification, the average mean opinion score of the converted speech increases by 4.7% and the average mel-cepstral distortion decreases by 2.7%. These results indicate that the proposed method improves the performance of the voice conversion system.

Second, to remove the dependence on a parallel corpus, this thesis aligns the non-parallel corpus using unit selection and vocal tract length normalization, and then applies the bilinear frequency warping plus amplitude scaling method based on adaptive Gaussian classification to non-parallel voice conversion. Subjective and objective evaluation experiments show that, compared with the non-parallel INCA method, the average mean opinion score of the converted speech increases by 4.0% and the average mel-cepstral distortion decreases by 7.1%, which indicates that the converted speech has higher quality and better similarity. Compared with the traditional Gaussian mixture model voice conversion method, the average mel-cepstral distortion is 5.1% higher and the average mean opinion score is 3.9% lower, which indicates that a gap in conversion performance remains. However, the proposed method works under non-parallel corpus conditions and is therefore more versatile.
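The objective results above are reported as average mel-cepstral distortion. The abstract does not give the exact formula the thesis uses, so the following is only a minimal sketch of one commonly used definition (in dB, averaged over time-aligned frames, with the energy coefficient c0 excluded). The function name `mel_cepstral_distortion` and the `skip_c0` parameter are hypothetical names chosen for illustration, and the frames are assumed to be already aligned.

```python
import math
from typing import Sequence


def mel_cepstral_distortion(
    converted: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
    skip_c0: bool = True,
) -> float:
    """Average mel-cepstral distortion in dB over time-aligned frames.

    Assumption: one common definition is used here,
        MCD = (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2),
    averaged over frames. The thesis may define the metric differently.
    """
    if len(converted) != len(target) or len(converted) == 0:
        raise ValueError("inputs must be non-empty and have the same number of frames")

    scale = 10.0 / math.log(10.0)
    start = 1 if skip_c0 else 0  # c0 mostly carries frame energy, so it is often dropped

    total = 0.0
    for conv_frame, tgt_frame in zip(converted, target):
        if len(conv_frame) != len(tgt_frame):
            raise ValueError("frames must have the same cepstral order")
        squared_error = sum(
            (c - t) ** 2 for c, t in zip(conv_frame[start:], tgt_frame[start:])
        )
        total += scale * math.sqrt(2.0 * squared_error)

    return total / len(converted)
```

Under this definition a lower value means the converted cepstra are closer to the target, so the percentage reductions quoted in the abstract would correspond to relative decreases in this average.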

Keywords/Search Tags:

Voice Conversion, Adaptive Gaussian Mixture Model Classification, Bilinear Frequency Warping, Parallel, Non-parallel

Related items

1	Research On High Quality Voice Conversion Algorithm Based On Improved GMM And Frequency Warping
2	Non-parallel Corpora Voice Conversion Based On Structured Gaussian Mixture Model Under Constraint Conditions
3	Voice Conversion Using Structured Gaussian Mixture Model In Eigen Space
4	Research On Technologies Of Voice Conversion Based On Gaussian Mixture Model
5	Adaptive Gaussian Mixture Model And Its Application In Speaker Recognition
6	Voice Conversion Based On GMM And Codebook Mapping
7	Key Algorithm In High Quality Voice Conversion System
8	GPU-Based Parallel Optimization Of Adaptive Gaussian Mixture Background Modeling Algorithms
9	Non-parallel Many-to-many Voice Conversion Based On Dynamic Convolution StyleGAN
10	Research On Methods For Voice Conversion