
Research On Cross-device Acoustic Scene Classification Method

Posted on: 2024-07-18  Degree: Master  Type: Thesis
Country: China  Candidate: G Jiang  Full Text: PDF
GTID: 2568307130952999  Subject: Computer technology
Abstract/Summary:
Acoustic scene classification is the task of identifying the environment in which a piece of audio was recorded by analyzing the information it contains. However, many kinds of recording devices exist in real life, and audio recorded by different devices differs in sampling rate, amplitude, frequency response, and other characteristics, causing a significant drop in the accuracy of acoustic scene classification models. In addition, unlike traditional classification problems, acoustic scene classification exhibits an implicit hierarchical structure between categories. Existing acoustic scene classification methods overlook this hierarchical structure information, making it difficult to further improve model accuracy. Given these difficulties and challenges, the main work of this thesis is as follows:

(1) Propose a cross-device acoustic scene classification method based on multi-level distance embedding learning. To address the significant differences in the features learned from similar audio recorded by different devices, this method uses a deep metric learning loss with multi-level distance regularization to reduce the distance, in the embedding space, between audio recorded in the same acoustic scene with different devices. The distances between dissimilar audio samples are constrained to different distance levels according to their degree of similarity, allowing the model to learn common features that are less affected by device differences. Experimental results on the TAU Urban Acoustic Scenes 2020 Mobile Development dataset show that the proposed method is more robust to device variation than existing methods. Compared with the baseline model, it improves classification accuracy by 1.2% on average across multiple devices and by 2.3% on audio from unseen devices.

(2) Propose a cross-device acoustic scene hierarchical classification method based on implicit category structure. To fully exploit the hierarchical structure (parent and child classes) among acoustic scenes and extract more feature information, a hierarchical classification framework for acoustic scenes is designed. Two classifiers in this framework classify the parent and child classes of acoustic scenes respectively, and the parent-class information is fused into the child-class classification process, forcing the model to learn the hierarchical structure information between acoustic scenes. In addition, to ensure that the predicted categories conform to the hierarchical relationship, a hierarchical dependency loss is designed to penalize mismatches between predicted parent and child classes. Ablation experiments on the TAU Urban Acoustic Scenes 2020 Mobile Development dataset verify the effectiveness of the hierarchical classification framework and the dependency loss. Compared with other advanced methods, the proposed method achieves leading performance, improving classification accuracy by 1.1% over the baseline model.

(3) Design and implement a cross-device acoustic scene classification prototype system. MATLAB is used to design the operating interface of the prototype system, and the core algorithms are implemented in Python. The system includes an acoustic scene dataset upload module, an acoustic scene classification model training module, and an acoustic scene classification module. The effectiveness and practicality of the proposed method are verified through the design and demonstration of the prototype system.
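The multi-level distance regularization of contribution (1) can be illustrated with a minimal sketch. The thesis does not give its exact loss formulation here, so the level assignment (same scene and device, same scene but different device, different scene) and the per-level target distances below are illustrative assumptions, not the thesis's actual loss:

```python
import numpy as np

def pair_level(label_a, label_b, dev_a, dev_b):
    """Assign a pair of samples to a distance level by similarity:
    0 = same scene, same device; 1 = same scene, different device;
    2 = different scene. (Illustrative; the thesis may use finer levels.)"""
    if label_a == label_b:
        return 0 if dev_a == dev_b else 1
    return 2

def multilevel_distance_loss(emb, labels, devices, targets=(0.0, 0.5, 1.0)):
    """Toy multi-level distance regularizer: pull each pair's embedding
    distance toward its level's target; for the dissimilar level, only
    penalize pairs closer than the target margin (hinge)."""
    n, loss, count = len(emb), 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(emb[i] - emb[j])
            lvl = pair_level(labels[i], labels[j], devices[i], devices[j])
            if lvl < 2:
                loss += (d - targets[lvl]) ** 2        # pull toward level target
            else:
                loss += max(0.0, targets[2] - d) ** 2  # push apart up to margin
            count += 1
    return loss / count
```

For example, two same-scene clips from different devices at embedding distance 1.0 incur a penalty of (1.0 - 0.5)^2 = 0.25, pulling them toward the same-scene level while still keeping them farther apart than same-device pairs.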
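The hierarchical dependency loss of contribution (2) can likewise be sketched. The scene hierarchy and the exact penalty form are assumptions for illustration (the parent groups below follow the common indoor/outdoor/transport grouping of the TAU dataset, which may differ from the thesis's hierarchy): the surrogate loss simply measures how much probability mass the child classifier assigns to children whose parent disagrees with the parent classifier's prediction.

```python
import numpy as np

# Hypothetical two-level hierarchy: child scene -> parent scene group.
PARENT_OF = {"airport": "indoor", "metro_station": "indoor",
             "park": "outdoor", "street_traffic": "outdoor",
             "bus": "transport", "tram": "transport"}
PARENTS = ["indoor", "outdoor", "transport"]
CHILDREN = list(PARENT_OF)

def hierarchy_dependency_loss(parent_probs, child_probs):
    """Toy surrogate for a hierarchical dependency loss: sum the child
    probability mass placed on classes whose parent differs from the
    parent classifier's predicted parent class."""
    pred_parent = PARENTS[int(np.argmax(parent_probs))]
    mismatch = sum(p for c, p in zip(CHILDREN, child_probs)
                   if PARENT_OF[c] != pred_parent)
    return float(mismatch)
```

If the parent classifier predicts "indoor" while the child classifier puts all its mass on "park", the loss is 1.0; a consistent prediction such as "airport" incurs no penalty, which is the mismatch the thesis's dependency loss is designed to discourage.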
Keywords/Search Tags: acoustic scene classification, cross-device, multi-level distance embedding learning, implicit class structure, hierarchical classification