
Research On Audio Event Classification Technology Based On Parallel Neural Network

Posted on: 2024-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Qiao
Full Text: PDF
GTID: 2568307106468174
Subject: Communication engineering
Abstract/Summary:
With the rapid development of unmanned factories in China and the growing demand for home care, indoor sound event monitoring has received widespread attention. Because it responds quickly and protects privacy well, audio event classification technology is broadly applicable across indoor environments. However, current research suffers from datasets of limited quality and quantity, algorithms with poor macroscopic classification ability, and models with low response sensitivity, all of which make engineering applications relatively difficult. To address these problems, this thesis conducts the following research.

First, to evaluate algorithms from several perspectives, the classification-based F1-score, the fragment-based F1-score, and the PSDS comprehensive evaluation metric are selected. To allow effective comparison of model performance, the DCASE 2022 Task 4 baseline model is used as the baseline.

Second, for training on weakly labeled datasets, and inspired by the CAM class-feature visualization algorithm from the image field, a Grad-CAM-based optimization of frame-level sound event classification is proposed. The backpropagated Grad-CAM weights of the class activation map are computed, and the probability distribution of the class activation map is used to infer frame-level sound event classes. The experimental results show that the algorithm's PSDS1 score is 0.205, which is 0.151 lower than the baseline system, and its PSDS2 score is 0.616, which is 0.089 higher than the baseline system. PSDS1, the metric that is stricter about frame-level inference sensitivity, degrades because strongly labeled data is not used. Taken together, these scores show that the algorithm effectively improves the macroscopic classification ability of the sound event classification algorithm.

Finally, for training on mixed-label datasets, an optimization of the sound event classification algorithm based on parallel networks is proposed. For model training, a pseudo-label loss computed from the classification inferences of a high-precision CNN14 model is added to the semi-supervised algorithm, converting unlabeled data into pseudo-weak-label data that can be used for training. For model inference, a CAM-director inference guidance algorithm for the parallel network is proposed, so that the weakly supervised branch participates in the parallel network's collaborative inference. The experimental results show that the parallel-network-based sound event classification algorithm reaches a PSDS1 score of 0.451 and a PSDS2 score of 0.721, improvements of 0.115 and 0.185 respectively over the baseline system. In summary, the parallel-network-based sound event classification algorithm achieves a significant and comprehensive improvement over the baseline model.
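
The abstract's Grad-CAM step (weights from backpropagated gradients, then a class activation map read along the time axis) can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the function names `grad_cam_frames` and `frame_decisions`, the (channels, frames) feature layout, the 0.5 threshold, and the toy linear clip-level classifier used to obtain gradients analytically are all assumptions made for illustration.

```python
import numpy as np

def grad_cam_frames(features, grads):
    """Per-frame Grad-CAM map, scaled to [0, 1].

    features, grads: arrays of shape (channels, frames). `grads` holds the
    gradient of one class score with respect to `features` (assumed layout).
    """
    weights = grads.mean(axis=1)                # alpha_k: mean gradient over frames
    cam = np.maximum(weights @ features, 0.0)   # ReLU of the weighted channel sum
    peak = cam.max()
    return cam / peak if peak > 0 else cam

def frame_decisions(cam, threshold=0.5):
    """Mark a frame as active when its normalized CAM reaches the threshold."""
    return cam >= threshold

# Toy input: channel 0 fires in frames 2-3, channel 1 is constant background.
features = np.zeros((2, 6))
features[0, 2:4] = 1.0
features[1, :] = 0.2

# Hypothetical clip-level classifier: y = W @ features.mean(axis=1),
# so d y / d features[k, t] = W[0, k] / T for every frame t.
W = np.array([[2.0, -1.0]])
T = features.shape[1]
grads = np.repeat(W[0][:, None] / T, T, axis=1)

cam = grad_cam_frames(features, grads)
active = frame_decisions(cam)
print(active.tolist())
```

In this toy case the clip-level (weak-label style) classifier never sees frame labels, yet the map localizes the event to frames 2 and 3. In a real network the gradients would come from automatic differentiation rather than a closed form.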
Keywords/Search Tags: semi-supervised learning, sound event classification, neural network, parallel network, weakly supervised learning, Grad-CAM