| With the introduction of the national Industrial Internet and Intelligent Manufacturing 2025 strategies, intelligent and unmanned operation has become an inevitable trend in industrial upgrading. A growing number of enterprises require audio monitoring, of which abnormal sound detection is an important part. Its core task is to determine, by modeling and analyzing sound in the industrial environment, whether an abnormal condition exists in a given time period or at a given location. Traditional solutions model and analyze large numbers of labeled samples, and their disadvantages are obvious: they require sufficient audio data (while abnormal audio data is difficult to obtain), and annotating that data demands considerable time, manpower, and professional knowledge. Self-supervised learning, with its strong modeling ability and no need for sample annotation, offers a new direction for this problem. This paper therefore applies self-supervised learning to abnormal sound detection. The main research contents are as follows: (1) Audio signal preprocessing: the effects of several audio feature representations are studied and compared, and the short-time Fourier transform is adopted as the feature representation for the abnormal sound detection task. Berouti spectral subtraction is used to denoise the audio signal, and a data augmentation method is studied and designed. (2) Design of the anomaly detection scheme: because data distributions in industrial scenes drift over time, the traditional classification-based scheme is abandoned in favor of a scheme based on retrieval and comparison, which effectively overcomes this distribution shift. In addition, the traditional classification evaluation metrics are refined to make the evaluation more valid and reliable. (3) Modeling method: because spectrograms differ from natural images, a network model with attention is designed. This model extracts effective spectral feature information and suppresses redundant spectral feature information, thereby improving model performance. Furthermore, given the scarcity of abnormal data in industry, a self-supervised modeling method that uses large-scale unlabeled datasets is adopted; it not only strengthens the feature extraction ability of the neural network but also significantly improves the separation between categories. In conclusion, on the industrial scene dataset, the abnormal sound detection algorithm based on self-supervised learning studied in this paper achieves abnormal recall rates of 88.9% and 59.3% at false positive rates of 1/100 and 1/1000, respectively, and an AUC of 97.9%, which meets expectations. The experimental results show that self-supervised learning outperforms both attention-based supervised learning and general supervised learning and yields higher class separation, which indicates that self-supervised learning is superior to supervised learning in the field of abnormal sound detection. |
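
The retrieval-and-comparison idea in item (2) can be illustrated with a minimal sketch. The abstract does not state how embeddings are compared or how the reference set is built, so everything here is an assumption: cosine similarity as the comparison, a small gallery of embeddings from known-normal clips, and the hypothetical function names `cosine_similarity` and `anomaly_score`. A query clip is scored by how far it is from its closest normal reference, so refreshing the gallery with recent normal recordings is one plausible way to follow a drifting distribution without retraining a classifier.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero vectors")
    return dot / (norm_a * norm_b)


def anomaly_score(query, normal_gallery):
    """Score a query embedding against a gallery of normal embeddings.

    Returns 1 minus the highest cosine similarity to any gallery entry,
    so a clip that closely matches some normal reference scores near 0.
    (Assumed scoring rule, not taken from the paper.)
    """
    if not normal_gallery:
        raise ValueError("normal_gallery must contain at least one embedding")
    best = max(cosine_similarity(query, ref) for ref in normal_gallery)
    return 1.0 - best


# Toy 2-D embeddings standing in for the output of a trained encoder.
gallery = [[1.0, 0.0], [0.8, 0.6]]
print(anomaly_score([2.0, 0.0], gallery))   # same direction as a reference
print(anomaly_score([0.0, 1.0], gallery))   # farther from every reference
```

In this sketch a threshold on the returned score would decide whether a clip is reported as abnormal; how that threshold is chosen is left open.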
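
The reported figures are recall on abnormal clips at a fixed false positive rate. As a hedged illustration of how such a number can be computed, the sketch below assumes each clip has a scalar anomaly score where higher means more anomalous. The function name `recall_at_fpr`, the strict-threshold rule, and the toy scores are hypothetical and are not the paper's evaluation code.

```python
def recall_at_fpr(normal_scores, abnormal_scores, max_fpr):
    """Recall on abnormal clips when at most max_fpr of normal clips are flagged.

    The threshold is set from the normal scores alone: a clip is flagged
    only if its score is strictly above the threshold.
    """
    if not normal_scores or not abnormal_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(normal_scores, reverse=True)
    # Number of normal clips that may be flagged without exceeding max_fpr.
    allowed = int(max_fpr * len(ranked))
    # Using the (allowed + 1)-th highest normal score as a strict threshold
    # guarantees that no more than `allowed` normal clips exceed it.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    flagged = sum(1 for s in abnormal_scores if s > threshold)
    return flagged / len(abnormal_scores)


# Toy scores: ten normal clips (one of them noisy) and four abnormal clips.
normal = [0.10, 0.20, 0.15, 0.30, 0.25, 0.05, 0.12, 0.18, 0.22, 0.90]
abnormal = [0.95, 0.80, 0.50, 0.28]
print(recall_at_fpr(normal, abnormal, 0.1))  # 0.75
print(recall_at_fpr(normal, abnormal, 0.0))  # 0.25
```

Tightening the allowed false positive rate raises the threshold, which is why recall drops as the rate goes from 1/100 to 1/1000 in the results quoted in the abstract.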