Research On Methods Of Improving Speech Communication Quality Based On Generative Adversarial Network

Posted on: 2023-07-10  Degree: Master  Type: Thesis
Country: China  Candidate: T Feng  Full Text: PDF
GTID: 2568306782462684  Subject: Control Engineering
Abstract/Summary:
Speech quality degrades during speech communication for many reasons. During signal acquisition, parts of the speech can be masked by background noise or lost, which lowers the quality of the speech signal. During transmission, the compression coding used to reduce the transmission rate also causes a loss of quality. In addition, the various other processing stages can degrade the signal to some extent. In recent years, algorithms based on deep learning have been studied extensively, and generative adversarial networks (GANs) have shown excellent performance in multimedia signal processing. To address the loss of speech quality in speech communication, this thesis proposes speech quality improvement methods based on generative adversarial networks and applies them in a practical system.

The thesis first proposes a method for the quality degradation caused by encoding and decoding during transmission and by encryption and decryption during speech processing. Rather than optimizing the encoder or the encryption algorithm, the method uses the strong learning ability of a conditional generative adversarial network to improve the quality of speech after codec processing and after encryption and decryption. The network consists of a generator G and a discriminator D. Skip connections are introduced into G to guide the generation of enhanced data, and feature vectors from the discriminator are fed into the generator as additional information to form the conditional GAN. Three datasets are built: the first pairs speech before encoding with speech after encoding and decoding; the second pairs speech before encryption with speech after encryption and decryption; the third pairs the original speech with speech that has passed through encryption, the codec, and decryption. The GAN model is trained on these three datasets, and the trained model is used to improve the three kinds of degraded speech. Objective speech quality metrics are used to demonstrate the effectiveness of the method.

To address the loss of speech quality caused by background noise, a speech enhancement GAN model based on background noise classification is proposed. First, a background noise classifier is designed with a neural network: it extracts the Mel-frequency cepstral coefficients (MFCCs) of the noisy speech and uses a convolutional neural network to classify the type of background noise. The noisy speech is labeled with its noise class, which guides the subsequent speech enhancement network in handling different kinds of background noise. Second, the least-squares loss function of the GAN is optimized: the weight of the L1-norm regularization term, the absolute difference between enhanced speech and clean speech, is controlled by the hyperparameter ξ. Adjusting ξ addresses the degradation of generated speech quality caused by overfitting during training. Third, a new generative adversarial network model is designed for speech enhancement, consisting of multiple generators G and one discriminator D, which simplifies the model without affecting performance. Each generator G enhances noisy speech containing one type of background noise, and the discriminator D distinguishes the enhanced speech produced by the generators from real speech. Compared with Wiener filtering, a speech enhancement algorithm based on a deep neural network, and a speech enhancement algorithm based on a generative adversarial network, the model based on background noise classification generalizes better in low-SNR environments, which demonstrates its effectiveness.

Finally, a practical end-to-end speech encryption terminal is introduced. The function is implemented by connecting the encryption terminal to a Jetson Nano development board on which the two models are deployed, and the quality of the enhanced speech is evaluated using subjective speech quality evaluation criteria.
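
The ξ-weighted loss mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so the form below is an assumption: a standard least-squares GAN objective in which the generator loss adds ξ times the mean absolute difference between enhanced and clean speech. The function names, the target values 1 and 0, and any value chosen for ξ are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def generator_loss(
    d_scores_fake: Sequence[float],
    enhanced: Sequence[float],
    clean: Sequence[float],
    xi: float,
) -> float:
    """Assumed generator loss: least-squares adversarial term plus xi * L1 term.

    d_scores_fake: discriminator outputs for the enhanced (generated) speech.
    enhanced, clean: waveform samples of equal length.
    xi: weight of the L1 term (the hyperparameter named in the abstract).
    """
    if len(enhanced) != len(clean):
        raise ValueError("enhanced and clean must have the same length")
    # Least-squares adversarial term: push D's score on enhanced speech toward 1.
    adversarial = 0.5 * _mean([(s - 1.0) ** 2 for s in d_scores_fake])
    # L1 term: mean absolute difference between enhanced and clean speech.
    l1 = _mean([abs(e - c) for e, c in zip(enhanced, clean)])
    return adversarial + xi * l1


def discriminator_loss(
    d_scores_real: Sequence[float], d_scores_fake: Sequence[float]
) -> float:
    """Assumed least-squares discriminator loss: real toward 1, enhanced toward 0."""
    real_term = 0.5 * _mean([(s - 1.0) ** 2 for s in d_scores_real])
    fake_term = 0.5 * _mean([s ** 2 for s in d_scores_fake])
    return real_term + fake_term
```

In this sketch a larger ξ makes the generator favor sample-level fidelity to the clean speech over fooling the discriminator; how the thesis actually tunes ξ is not stated in the abstract.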
Keywords/Search Tags:Loss of speech quality, Speech coding, Generative adversarial network, Speech quality improvement