Research On Accent Conversion Based On Attention Mechanism And Accent Representation

Posted on: 2024-07-09
Degree: Master
Type: Thesis
Country: China
Candidate: X X Zang
GTID: 2558307094979439
Subject: Master of Electronic Information (Professional Degree)
Abstract:
Accent conversion converts a speaker’s pronunciation to another accent of the same language while retaining the speaker’s own voice characteristics. It has a wide range of applications, such as film dubbing and personalized text-to-speech synthesis. In recent years, end-to-end accent conversion methods based on phonetic posteriorgrams (PPGs) have improved conversion quality to some extent. However, PPGs are regarded as speaker-independent content representations and are high-dimensional, so the converted speech lacks speaker characteristics and training is slow. To address poor accent and timbre conversion, this thesis focuses on the following two topics.

(1) An accent conversion method based on a concentrated attention mechanism. To address sparse attention weights in speech representation, we introduce a concentrated attention mechanism into the synthesizer network, which aligns the input and output sequences better and yields more accurate acoustic features. To better represent the content of the speech, we use bottleneck features instead of PPGs and add pitch features to control the speaker’s tone effectively. Experimental results show that the proposed model improves, to a certain extent, the accent conversion effect, the naturalness of the converted speech, and its similarity to the source speaker.

(2) An accent conversion method that incorporates an accent representation. To address poor accent representation, we use a pre-trained model to train an accent category encoder on a dialect dataset, obtain an accent representation from it, and add this representation to the synthesizer network, so that the converted speech carries more of the target speaker’s accent. Experimental results show that adding the accent representation makes the accent of the converted speech more similar to that of the target speaker.
Keywords: accent conversion, concentrated attention mechanism, bottleneck feature, accent representation
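
The abstract does not say how the bottleneck features, pitch features, and accent representation are combined before the synthesizer. One common arrangement, shown here only as a minimal sketch and not as the thesis's actual implementation, is to concatenate them frame by frame. The function name `build_synthesizer_input`, the log-F0 plus voiced-flag pitch encoding, the feature dimensions, and the concatenation itself are all illustrative assumptions.

```python
import math
from typing import List, Sequence


def build_synthesizer_input(
    bottleneck: Sequence[Sequence[float]],
    f0_hz: Sequence[float],
    accent_embedding: Sequence[float],
) -> List[List[float]]:
    """Concatenate per-frame content, pitch, and accent features.

    Hypothetical helper: the thesis abstract does not specify this layout.
    bottleneck       -- one content vector per frame
    f0_hz            -- one pitch value per frame, 0.0 for unvoiced frames
    accent_embedding -- one utterance-level vector, repeated on every frame
    """
    if len(bottleneck) != len(f0_hz):
        raise ValueError("bottleneck and f0_hz must have the same number of frames")

    frames: List[List[float]] = []
    for content, f0 in zip(bottleneck, f0_hz):
        voiced = 1.0 if f0 > 0.0 else 0.0
        # Log-scale pitch for voiced frames; unvoiced frames get 0.0.
        log_f0 = math.log(f0) if voiced else 0.0
        frames.append(list(content) + [log_f0, voiced] + list(accent_embedding))
    return frames


# Toy input: 2 frames, 3-dim bottleneck features, 2-dim accent embedding.
bottleneck = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
f0_hz = [120.0, 0.0]  # second frame is unvoiced
accent_embedding = [1.0, -1.0]

features = build_synthesizer_input(bottleneck, f0_hz, accent_embedding)
print(len(features), len(features[0]))
```

In this toy layout each frame has 3 content dimensions, 2 pitch dimensions, and 2 accent dimensions. Repeating the utterance-level accent vector on every frame is one simple way to condition a frame-level synthesizer; other conditioning schemes are equally plausible.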