In recent years, production accidents in the chemical industry have increased significantly, causing severe loss of life and property. Investigations show that a substantial proportion of these accidents are caused by unauthorized entry and by equipment failures that are not handled in time. Preventing such accidents requires identifying these abnormal situations promptly. This thesis takes audio as its research subject and studies sound scene classification and abnormal sound detection algorithms, with the goal of designing an intelligent detection system that applies these algorithms to real-world scenarios. The main contributions are as follows:

(1) To address the low accuracy of sound scene classification with single-feature input, a method based on an improved Swin Transformer is proposed. The improved algorithm consists of a feature extraction module and a feature fusion module. The feature extraction module focuses on global information at different levels, while the feature fusion module fuses multi-scale feature information, enhancing the discriminative power of the extracted features. Appropriate acoustic feature extraction schemes are selected, and comparative experiments are designed to validate both the improved classification method and the selected acoustic feature combination. Experimental results show that the proposed method outperforms current mainstream sound scene classification algorithms on the UrbanSound8K dataset, with an improvement of approximately 2 percentage points over the Swin Transformer model. This validates the feasibility of the proposed model improvement.

(2) To address unknown equipment failure types and insufficient abnormal sound samples, an abnormal sound detection algorithm based on an improved MobileViT is proposed. Building on the MobileViT framework, the work focuses on improving the MobileViT block structure. The improved MobileViT block consists of modules for local feature extraction, global feature extraction, and feature fusion. The dual-channel feature extraction module extracts global and local feature information separately, while the feature fusion module combines feature information with different semantics. Eliminating point-wise convolutions within the feature extraction module retains more feature information. Experimental results show that, compared with two baseline systems, the proposed model improves abnormal sound detection performance to varying degrees on four categories of machines: Valve, Pump, Fan, and Slide Rail. This validates the superiority of the proposed method in terms of AUC, Recall, and the number of training parameters.

(3) A comprehensive intelligent detection system is designed and implemented to detect intrusion sounds and abnormal equipment sound events in real time. The system includes modules for sound scene classification and abnormal sound detection and supports both online and offline detection. It is highly scalable and maintainable, and functional testing verifies that it meets practical application requirements.
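
The AUC and Recall metrics mentioned in (2) can be illustrated with a minimal sketch. It assumes the detector outputs one anomaly score per audio clip (higher means more anomalous) and that labels are 1 for abnormal and 0 for normal; the function names, the example scores, and the threshold of 0.5 are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (abnormal, normal) pairs ranked correctly.

    A pair counts 1 when the abnormal clip scores higher than the
    normal clip, and 0.5 when the two scores are tied.
    """
    abnormal = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    if not abnormal or not normal:
        raise ValueError("need at least one normal and one abnormal clip")
    wins = 0.0
    for a in abnormal:
        for n in normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal) * len(normal))


def recall_at_threshold(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> float:
    """Share of abnormal clips whose score reaches the threshold."""
    abnormal = [s for s, y in zip(scores, labels) if y == 1]
    if not abnormal:
        raise ValueError("need at least one abnormal clip")
    detected = sum(1 for s in abnormal if s >= threshold)
    return detected / len(abnormal)


# Hypothetical anomaly scores for six clips (illustrative values only).
example_scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.20]
example_labels = [0, 0, 1, 1, 1, 0]

if __name__ == "__main__":
    print("AUC:", roc_auc(example_scores, example_labels))
    print("Recall@0.5:", recall_at_threshold(example_scores, example_labels, 0.5))
```

The pairwise formulation is quadratic in the number of clips, which is acceptable for a small illustration; a rank-based computation would be preferable for large test sets.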