| Speech is one of the main means of communication among human beings, as well as between humans and machines. It has become an important tool for modern communication in the information age. However, speech quality is inevitably degraded by surrounding noise in real-life conditions. To reduce the effects of noise and improve speech quality, speech enhancement techniques can be used as a preprocessing step, thereby improving system performance. Speech enhancement plays an important role in many fields, such as speech recognition, low-bit-rate speech coding, military communication, and automation. Based on a study of the fundamentals of speech enhancement and of spectral subtraction algorithms, this paper proposes a new speech enhancement system based on the auditory masking effect and a noise spectrum estimation algorithm with constrained variance. The main contributions are as follows. This paper analyzes several existing noise spectrum estimation algorithms, including the minimum statistics method and the minima controlled recursive averaging method, and proposes a real-time noise spectrum estimation algorithm. This algorithm estimates the noise power spectrum quickly and accurately and adapts better to rapid changes in the noise. Its advantage is that it is independent of the noise level. It estimates the smoothing parameters from the statistics of the smoothed noisy speech while constraining the variance of the smoothed short-term noisy spectrum, which reduces the estimation bias caused by tracking minimum values. In this way, the minimum values of the smoothed short-term spectrum can be obtained quickly and accurately. These minimum values can then be used for voice activity detection and for updating the estimated noise spectrum. 
Compared with other noise spectrum estimation algorithms, this method estimates the noise power spectrum faster and more precisely, which greatly improves performance. Most enhancement algorithms emphasize only the SNR and ignore speech distortion. To address this problem, a speech enhancement algorithm based on the auditory masking threshold and the a priori SNR is proposed. To reach a compromise among eliminating background noise, suppressing residual noise, and reducing speech distortion, the algorithm focuses on the gain function. First, a perceptual constraint is applied to the gain function: residual noise whose energy lies below the masking threshold, and which is therefore inaudible, is retained, which reduces speech distortion. Then a cost function is introduced, from which the initial gain function is obtained. At the same time, to reduce "musical noise", the instantaneous SNR is introduced, and smoothing parameters that vary over time and frequency are used to obtain a more accurate approximation of it. The final gain function is adjusted adaptively according to the masking properties of human hearing within each frame and the changes in the a priori SNR across frames. This reduces both residual noise and speech distortion. Simulation experiments are conducted, and both subjective and objective evaluation criteria are used to assess the enhanced speech. The experimental results show that, compared with commonly used methods, the proposed algorithm removes background noise better and suppresses residual noise effectively with minimal speech distortion. |
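
The minimum-tracking idea behind the noise spectrum estimation described above can be illustrated with a minimal Python sketch. It is not the algorithm proposed in this paper. It assumes a fixed smoothing factor `alpha` in place of the adaptive, variance-constrained smoothing parameter, and a short sliding window of frames for the minimum search. The names `track_noise_min` and `is_speech`, and the parameters `alpha`, `window`, and `ratio`, are hypothetical and chosen only for illustration.

```python
from collections import deque


def track_noise_min(frames, alpha=0.8, window=5):
    """Estimate the noise power per frequency bin by minimum tracking.

    frames: list of per-frame power spectra (each a list of floats).
    alpha:  fixed recursive smoothing factor (assumption; the paper's
            method adapts this parameter instead).
    window: number of recent smoothed frames searched for the minimum
            (assumption).
    Returns one noise estimate (list of floats) per input frame.
    """
    if not frames:
        return []
    n_bins = len(frames[0])
    smoothed = list(frames[0])
    # Keep only the most recent `window` smoothed spectra.
    history = deque([list(smoothed)], maxlen=window)
    estimates = [list(smoothed)]
    for frame in frames[1:]:
        # First-order recursive smoothing of the short-term power spectrum.
        smoothed = [alpha * s + (1.0 - alpha) * p
                    for s, p in zip(smoothed, frame)]
        history.append(list(smoothed))
        # The per-bin minimum over the window serves as the noise estimate.
        estimates.append([min(h[k] for h in history) for k in range(n_bins)])
    return estimates


def is_speech(power, noise, ratio=3.0):
    """Simple voice activity decision for one bin (hypothetical threshold)."""
    return power > ratio * noise
```

With a stationary noise floor and a short burst of higher energy, the estimate stays at the floor as long as the burst is shorter than the window, which is why the tracked minimum can also support a voice activity decision. A fixed `alpha` and a fixed window make the tracker slow to follow rising noise, which is the kind of limitation an adaptive smoothing parameter is meant to address.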