Research On Transformer-based Acoustic Model Based On Gated Generative Adversarial Network

Posted on: 2023-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: X D Lv
Full Text: PDF
GTID: 2568307043489084
Subject: Computer technology
Abstract/Summary:
In recent years, advances in deep learning have significantly improved the accuracy of speech recognition. In natural environments, however, speech recognition systems are often disturbed by environmental noise. Noise data augmentation training, which overlays noise on the training set, is commonly used to improve the noise robustness of acoustic models, but it makes the model perform worse in quiet environments. The Transformer is a common modeling method in speech recognition, offering fast training and good recognition accuracy. This thesis uses a gated generative adversarial network to improve the effect of noise data augmentation training on a Transformer-based acoustic model.

(1) The first contribution introduces a generative adversarial network into the Transformer-based acoustic model. The shallow layers of the acoustic model serve as the generator, and the discriminator is built independently of the acoustic model. The generator processes the input audio features and feeds its output to the remaining layers of the acoustic model, while making it difficult for the discriminator to distinguish the source of the input features. Through adversarial training, the shallow layers of the acoustic model learn more noise-invariant information, which improves the effect of noise data augmentation training for the Transformer-based acoustic model.

(2) The second contribution introduces a gated convolutional neural network into the Transformer-based acoustic model to strengthen the model's feature learning ability. Adversarial training enables the gating unit to learn to suppress noise-related features.

Experimental results on the Aishell-1 dataset show that, compared with a model trained with noise data augmentation alone, the model that uses the gated generative adversarial network reduces the error rate by a relative 4.4% on average on the clean test data and 5.3% on the noisy test data, without changing the number of model parameters or the inference speed.
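
The abstract gives no formulas for the noise overlay or the gating unit. As a rough illustration only, the sketch below assumes that noise is mixed into clean audio at a target signal-to-noise ratio and that the gating unit takes the common gated-linear-unit form, in which one convolution produces the content and a second, sigmoid-activated convolution produces the gate. The function names (`mix_at_snr`, `conv1d`, `gated_conv1d`), the single-channel setup, and the SNR-based mixing rule are assumptions for illustration, not the thesis's implementation.

```python
import math


def mix_at_snr(clean, noise, snr_db):
    """Overlay noise on a clean signal so the mixture has the requested SNR in dB.

    Assumed mixing rule; the thesis does not state how noise is overlaid.
    """
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    # Choose scale so that clean_power / (scale^2 * noise_power) == 10^(snr_db / 10).
    scale = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return [x + scale * n for x, n in zip(clean, noise)]


def sigmoid(z):
    # Plain logistic function; adequate for the moderate values used here.
    return 1.0 / (1.0 + math.exp(-z))


def conv1d(seq, kernel, bias):
    """Zero-padded 1-D convolution (cross-correlation) over a single-channel sequence.

    Output has the same length as the input; intended for odd kernel lengths.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [
        sum(k * padded[t + i] for i, k in enumerate(kernel)) + bias
        for t in range(len(seq))
    ]


def gated_conv1d(seq, w, b, v, c):
    """Hypothetical gating unit: conv(seq, w, b) times sigmoid(conv(seq, v, c)).

    A gate value near 0 suppresses a feature; a value near 1 passes it through.
    """
    content = conv1d(seq, w, b)
    gate = [sigmoid(g) for g in conv1d(seq, v, c)]
    return [h * g for h, g in zip(content, gate)]


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    noise = [0.5, 0.5, -0.5, -0.5]
    print(mix_at_snr(clean, noise, snr_db=10))
    # Identity content kernel with a strongly negative gate bias: features are suppressed.
    print(gated_conv1d([1.0, 2.0, 3.0], [0, 1, 0], 0.0, [0, 0, 0], -50.0))
```

In this form, a gate whose pre-activation is pushed strongly negative drives the corresponding feature toward zero, which is one way a gating unit could learn to suppress noise-related features under an adversarial objective.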
Keywords/Search Tags:Speech Recognition, Transformer-based Acoustic Model, Generative Adversarial Network, Noise Robustness, Gated Convolutional Neural Network