Research On Multi-channel Speech Enhancement Technology Based On Beamforming And Time-frequency Masking

Posted on:2021-10-26

Degree:Master

Type:Thesis

Country:China

Candidate:J Q Chen

Full Text:PDF

GTID:2518306476450814

Subject:Electronics and Communications Engineering

Abstract/Summary:

Speech enhancement is a key part of front-end acoustic signal processing: it is an important means of improving speech quality and a prerequisite for downstream speech tasks. In real-life scenarios, however, complex and changeable interference seriously degrades the quality of transmitted speech, so improving the quality of noisy speech remains a challenging task. Compared with traditional single-channel speech enhancement, multi-channel speech enhancement can additionally exploit the spatial information of the speech, which helps to improve the quality of noisy speech in complex environments. This thesis studies multi-channel speech enhancement based on beamforming and time-frequency masking. The main research contents are as follows:

(1) Traditional microphone array signal processing techniques are studied. On this basis, the advantages and disadvantages of the classic beamforming algorithms for multi-channel speech enhancement and of the commonly used post-filtering algorithms are analyzed. Finally, the subjective and objective methods used in existing speech quality evaluation are analyzed, and PESQ and STOI are selected as the objective metrics for the subsequent experimental analysis.

(2) Time-frequency masking, the recurrent neural unit, and its main variants are studied, and a multi-channel speech enhancement algorithm combining time-frequency masking with a recurrent neural network is proposed. Time-frequency masking provides a good training target for supervised learning. Compared with traditional neurons, recurrent neural units can make good use of historical information. More importantly, a post-filtering algorithm built with a recurrent neural network can further improve the quality of the speech produced by delay-and-sum beamforming. The effectiveness and superiority of the proposed algorithm are verified on both the synthesized data set and the recorded data set.

(3) The basic structure of convolutional neural networks and the theoretical basis of multitask learning are studied, and a multi-channel speech enhancement algorithm combining a convolutional neural network with multitask learning is proposed. Convolutional neural networks have a strong ability to learn the required features automatically, and multitask learning helps to further improve the generalization ability of the model. More importantly, a convolutional neural network can fuse fixed beamforming and the post-filtering algorithm into a single model. The experimental results show that the proposed algorithm is effective not only on the synthesized data set but also on multi-channel speech recorded in real scenes.
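
The delay-and-sum beamforming and time-frequency masking named in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the function names `delay_and_sum` and `ideal_ratio_mask` are hypothetical, the per-microphone delays are assumed to be known non-negative integers (in practice they would have to be estimated), and the ideal ratio mask is only one common choice of masking target, which the abstract does not specify.

```python
import numpy as np


def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: array of shape (num_mics, num_samples)
    delays:   arrival delay of the target at each microphone, in samples
              (assumed known, non-negative integers for this sketch)
    """
    channels = np.asarray(channels, dtype=float)
    num_mics, num_samples = channels.shape
    aligned = np.zeros_like(channels)
    for m, d in enumerate(delays):
        # Advance channel m by d samples so the target lines up across mics;
        # the last d samples of the aligned channel are left as zeros.
        aligned[m, : num_samples - d] = channels[m, d:]
    # Averaging keeps the aligned target and attenuates uncorrelated noise.
    return aligned.mean(axis=0)


def ideal_ratio_mask(speech_mag, noise_mag, eps=1e-8):
    """One common mask definition: sqrt(S^2 / (S^2 + N^2)) per T-F bin.

    speech_mag, noise_mag: magnitude spectrograms of the same shape.
    eps avoids division by zero in bins where both are silent.
    """
    s2 = np.square(speech_mag)
    n2 = np.square(noise_mag)
    return np.sqrt(s2 / (s2 + n2 + eps))
```

In a mask-based post-filter of the kind described, a network would be trained to predict such a mask from the beamformed signal, and the predicted mask would then be multiplied with the noisy magnitude spectrogram; the sketch above only shows the two fixed building blocks.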

Keywords/Search Tags:

Beamforming, Time-frequency masking, Recurrent neural network, Convolutional neural network, Multitask learning

Related items

1	Research On Key Technologies Of High Performance Accelerator For Convolution And Recurrent Neural Networks
2	Multitask Learning In Deep Neural Network And Social Media
3	Design And Implementation Of Sound Spectrogram Recognition System Based On Convolutional Neural Network
4	Identification Of The InSAR Persistent Scatterers Based On Deep Learning
5	Recurrent Convolutional Neural Networks With Applications
6	Research On Time Series Classification Algorithm Based On Convolutional Neural Network And Recurrent Neural Network
7	Research On Image Classification Algorithm Based On Circular Convolutional Neural Network
8	Research On Deep Neural Network Model Of Audio Tagging
9	Research On Time Series Forecasting Technology Based On Deep Neural Network
10	Pattern Recognition And Frequency Measurement Of φ-OTDR Based On Convolutional Neural Network