Research On Sound Event Detection Based On Weakly Supervised Learning

Posted on:2021-04-19

Degree:Master

Type:Thesis

Country:China

Candidate:Q Yang

Full Text:PDF

GTID:2568306104970719

Subject:Information and Communication Engineering

Abstract/Summary:

In recent years, weakly supervised learning on weakly labeled audio data has become an active research topic in sound event detection. This thesis addresses four problems in sound event detection: weak supervision, the limited receptive field of convolutional networks, insufficient labeled data, and overlapping audio events. It improves deep neural network architectures to raise detection performance.

Firstly, to separate sound events from background scenes or noise, a Res2Net Expectation-Maximization Attention Network (Res2EMANet), built on a time-frequency segmentation network, is proposed for weakly supervised sound event detection. A general convolutional neural network is limited by its local receptive field and cannot fully capture long-range information. The thesis proposes combining the Res2Net network with the expectation-maximization attention mechanism to enlarge the receptive field effectively. Experimental results show that the proposed model outperforms the baseline system on sound event detection.

Secondly, an improved mean teacher model is proposed for semi-supervised sound event detection, so that a large amount of unlabeled data can be used to improve performance. The training strategy is improved by applying the Stochastic Weight Averaging algorithm to sound event detection for the first time, which speeds up prediction and reduces cost. The model architecture is improved with a global weighted rank pooling layer, which mitigates the tendency of conventional pooling to underestimate or overestimate sound events. In addition, the SpecAugment data augmentation method is adopted to reduce overfitting. Experimental results show that the proposed model outperforms the baseline mean teacher system on sound event detection.

Finally, to handle the overlap of sound events in real audio clips, an SECapsule Recurrent Attention Neural Network (SECapsRANN) is proposed for polyphonic sound event detection. The proposed model combines the advantages of SENet and CapsNet to separate each individual sound event from the mixed, overlapping features. An attention mechanism is introduced so that the network focuses on salient events. Experiments show that the proposed model handles overlapping sound events effectively and improves sound event detection performance.
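The global weighted rank pooling layer mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the thesis's exact formulation, so the code below follows the commonly used definition: frame-level probabilities are sorted in descending order and averaged with geometrically decaying weights. The function name `global_weighted_rank_pooling`, the `decay` parameter, and the sample values are assumptions made for illustration only.

```python
def global_weighted_rank_pooling(frame_probs, decay=0.5):
    """Pool frame-level probabilities into one clip-level probability.

    Hypothetical helper: frames are ranked from highest to lowest
    probability, and the j-th ranked frame gets weight decay**j.
    decay=0 reduces to max pooling; decay=1 reduces to mean pooling.
    """
    if not frame_probs:
        raise ValueError("frame_probs must not be empty")
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")

    # Rank frames so the most confident frame receives the largest weight.
    ranked = sorted(frame_probs, reverse=True)
    weights = [decay ** j for j in range(len(ranked))]

    # Normalized weighted sum of the ranked probabilities.
    return sum(w * p for w, p in zip(weights, ranked)) / sum(weights)


# Illustrative frame-level probabilities for one event class in one clip.
probs = [0.9, 0.1, 0.2, 0.1]
as_max = global_weighted_rank_pooling(probs, decay=0.0)
as_mean = global_weighted_rank_pooling(probs, decay=1.0)
in_between = global_weighted_rank_pooling(probs, decay=0.5)
```

With an intermediate `decay`, the clip-level score lies between the maximum and the mean of the frame scores. This is the sense in which such pooling sits between max pooling, which tends to underestimate event extent, and average pooling, which tends to overestimate it.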

Keywords/Search Tags:

sound event detection, weakly supervised learning, attention mechanism, mean teacher model, capsule network

Related items

1	The Research Of Sound Event Classification And Detection On Semi-supervised Learning Method
2	Semi-supervised Sound Event Detection Based On Deep Neural Network
3	Research On Sound Event Detection Based On Deep Learning
4	Research On Sound Event Detection Technology In Domestic Environment
5	Weakly Supervised Sound Event Recognition On Noisy Label Dataset
6	Sound Event Detection Using Attention Mechanism And Interactive Annotation
7	The Research Of Weakly Supervised Object Detection Based On Attention Mechanism
8	Research On Polyphonic Sound Event Detection With Deep Neural Network
9	Research On Abnormal Event Intelligent Detection In Videos
10	Audio Classification And Sound Event Detection Based On Convolutional Sparse Coding Model