With the rapid development of artificial intelligence, speech interaction technology has become increasingly widespread in daily life. However, speech signals are highly susceptible to environmental noise, which causes delays and false recognitions in speech interaction systems. Speech enhancement technology can effectively improve speech quality and intelligibility, and it is an important front-end signal processing technique for speech recognition, speech synthesis, and other applications. Speech enhancement techniques can be roughly divided into two categories: traditional enhancement methods based on digital signal processing, and enhancement methods based on supervised learning. Traditional methods form the basis of speech enhancement and remain of significant research interest, while enhancement methods based on deep learning have achieved impressive results driven by large-scale data. Therefore, this thesis focuses on both deep-neural-network-based and traditional speech enhancement algorithms. The main work and contributions are organized as follows.

Firstly, assuming that the amplitude spectrum of the Fourier transform coefficients of the speech signal follows a Chi distribution, an improved Bayesian estimator with auditory-perception-based generalized weighting under the Chi distribution is proposed. The estimator outperforms the traditional Bayesian estimator in noise suppression. However, compared with stationary noise, its performance on non-stationary noise is still not ideal.

Speech enhancement methods based on deep neural networks handle non-stationary noise well, but the network training process is time-consuming. Experiments have shown that, during neural network training, using enhanced features as inputs achieves better results than using the original noisy features. Furthermore, the residual noise in speech enhanced by the Bayesian estimator has relatively similar characteristics across noise types, which reduces the required training time and the size of the training database. Therefore, this thesis combines the advantages of these two methods and proposes an improved speech enhancement algorithm that integrates the Bayesian estimator with a deep neural network.

Finally, to address the poor performance on non-stationary noise when the minimum mean square error (MMSE) cost function is used in the training stage of the deep neural network, this thesis replaces the traditional MMSE cost function with the minimum error entropy (MEE) cost function. Incorporating the MEE cost function into the improved algorithm above, a speech enhancement method is proposed that combines the auditory-perception generalized weighted Bayesian estimator with a deep neural network trained under the minimum error entropy optimization criterion. The effectiveness of this improved method is verified by computer simulation experiments.
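
For reference, a common form of a Chi amplitude prior and of a generalized weighted Bayesian amplitude estimator is sketched below; this is only an illustrative formulation, and the shape and weighting parameters mu, lambda, alpha, and beta, as well as the specific auditory-perception weighting used in the thesis, are assumptions not reproduced here.

```latex
% Sketch of a Chi prior on the clean spectral amplitude A and of a
% generalized weighted squared-error cost; alpha and beta act as
% (perceptually motivated) weighting exponents.
\begin{align}
  p(a) &= \frac{2}{\Gamma(\mu)}\left(\frac{\mu}{\lambda}\right)^{\mu}
          a^{2\mu-1}\exp\!\left(-\frac{\mu}{\lambda}a^{2}\right),
          \qquad a \ge 0, \\
  d\!\left(A,\hat{A}\right) &=
          \frac{\left(A^{\beta}-\hat{A}^{\beta}\right)^{2}}{A^{\alpha}}, \\
  % Estimate minimizing the expected cost given the noisy coefficient Y
  \hat{A} &= \left(\frac{E\!\left[A^{\beta-\alpha}\mid Y\right]}
                        {E\!\left[A^{-\alpha}\mid Y\right]}\right)^{1/\beta}.
\end{align}
```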
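
To make the cost-function change concrete, the following minimal sketch contrasts the conventional MMSE cost with an MEE cost estimated by a Gaussian Parzen window. It is an illustrative implementation rather than the thesis code; the kernel bandwidth sigma and the use of PyTorch tensors are assumptions.

```python
# Minimal sketch: MMSE loss vs. a minimum-error-entropy (MEE) style loss.
# Minimizing Renyi's quadratic entropy of the error is equivalent to
# maximizing the information potential, so the MEE loss below returns the
# negative information potential estimated with a Gaussian kernel.
import torch

def mmse_loss(pred, target):
    # Conventional mean-squared-error cost between enhanced and clean features.
    return torch.mean((pred - target) ** 2)

def mee_loss(pred, target, sigma=1.0):
    # sigma is an assumed Parzen-window (kernel) bandwidth hyperparameter.
    e = (pred - target).reshape(-1)            # error samples
    diff = e.unsqueeze(0) - e.unsqueeze(1)     # pairwise error differences
    kernel = torch.exp(-diff ** 2 / (2 * sigma ** 2))
    information_potential = kernel.mean()
    return -information_potential

# Usage sketch: either loss can be plugged into the same training loop,
# e.g. loss = mee_loss(network(noisy_features), clean_features).
```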