Early acoustic research did not combine the detection and localization of sound signals, but instead pursued two separate directions: sound event detection and sound source localization. In recent years, the two have been combined to simultaneously recognize and locate sound events, which involves detecting whether a sound event occurs in an audio signal, identifying the event category, and estimating the azimuth and elevation angles of each event's direction of arrival. The SELD baseline system for the DCASE 2020 sound event localization and detection task achieves these functions, but issues remain, such as poor recognition accuracy and localization accuracy that needs improvement. This paper proposes the following improvements to the baseline system:

(1) First, to address the insufficient feature extraction of the network model in the DCASE 2020 SELD baseline system, the paper proposes an improved SR-BiGRU model. Compared with the baseline system, the CNN layer is replaced with an improved convolutional block consisting of three convolutional sub-blocks connected in a residual-like structure, which increases the network depth while mitigating vanishing and exploding gradients. The improved convolutional block also includes a squeeze-and-excitation residual convolution module that strengthens the network's ability to extract features across the channel and spatial dimensions of the data. Simulations of the improved model yield an error rate ER≤20° of 0.49, an F-score F≤20° of 61.7, a class-dependent localization error LE_CD of 18.1°, and a class-dependent localization recall LR_CD of 67.7, a significant improvement over the baseline system's 0.72, 37.7, 23.5, and 62.0 on the corresponding metrics.

(2) Second, to address the model's insufficient ability to extract temporal features, the paper proposes the SR-TCN network model, an improvement on the SR-BiGRU model. The SR-TCN model replaces the bidirectional gated recurrent unit network used for temporal analysis and detection with a bidirectional temporal convolutional network, improving the model's ability to extract temporally continuous features from the data. The paper also introduces new fused features and data augmentation to improve the model's robustness. With FOA-format data, the model achieves ER≤20°, F≤20°, LE_CD, and LR_CD of 0.45, 65.2, 16.8°, and 73.2, respectively; with MIC-format data, it achieves 0.48, 62.1, 17.9°, and 71.3, respectively, all better than the baseline system's results.

This paper mainly improves the baseline system for sound event recognition and localization, raising the model's accuracy at a small time cost. In the future, sound event recognition and localization can be applied in many fields, such as helping hearing-impaired individuals identify sound categories and sources, enhancing microphone directionality in teleconferences, and helping robots navigate and interact with their surroundings.
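The squeeze-and-excitation idea mentioned above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the channel count, reduction ratio, and weight values below are assumed purely for illustration. The "squeeze" step pools each channel to a single scalar, and the "excite" step passes those scalars through a small bottleneck network whose sigmoid output reweights the channels:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squeeze_excite(x, w1, w2):
    """Squeeze-and-excitation over a (channels, time, freq) feature map.

    Squeeze: global average pool per channel -> (C,)
    Excite:  two FC layers with a bottleneck, sigmoid gate -> (C,)
    Scale:   reweight each channel of the input feature map.
    """
    z = x.mean(axis=(1, 2))            # squeeze: one scalar per channel
    s = np.maximum(z @ w1, 0.0)        # FC + ReLU bottleneck (C -> C/r)
    g = sigmoid(s @ w2)                # FC + sigmoid gate, back to (C,)
    return x * g[:, None, None]        # scale each channel by its gate

# Toy example: 8 channels, reduction ratio r = 4 (assumed values)
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 10, 12))   # (channels, time, freq)
w1 = rng.standard_normal((8, 2)) * 0.1
w2 = rng.standard_normal((2, 8)) * 0.1
y = squeeze_excite(x, w1, w2)
print(y.shape)  # (8, 10, 12): same shape, channels rescaled
```

Because the gate is a sigmoid, each channel is scaled by a factor in (0, 1), so informative channels are emphasized relative to the rest without changing the feature map's shape.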
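The core operation behind a temporal convolutional network like the one SR-TCN uses is a causal dilated 1-D convolution: each output step sees only past inputs, and the dilation widens the receptive field as layers stack. The following NumPy sketch shows that single operation on one sequence; it is a didactic illustration under assumed kernel values, not the SR-TCN architecture itself:

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation):
    """Causal dilated 1-D convolution, the building block of a TCN layer.

    x : (T,) input sequence
    w : (K,) kernel taps
    The output at step t depends only on x[t], x[t-d], x[t-2d], ...
    (causality), and dilation d spaces the taps so that stacking layers
    with growing d widens the receptive field exponentially.
    """
    T, K = len(x), len(w)
    pad = (K - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])  # left-pad to keep output causal
    y = np.zeros(T)
    for t in range(T):
        for k in range(K):
            y[t] += w[k] * xp[pad + t - k * dilation]
    return y

# Sanity check: the identity kernel [1, 0] returns the input unchanged
x = np.arange(6, dtype=float)
print(causal_dilated_conv1d(x, np.array([1.0, 0.0]), dilation=2))
```

With a kernel of [0, 1] and dilation 2 the same routine delays the sequence by two steps, which makes the causal padding visible: no output sample ever depends on a future input.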