| With the development of computer technology, using computer vision to identify whether a person is wearing a mask is of great practical importance. In certain industries and working environments, such as hospital operating rooms, industrial and mining enterprises with heavy fumes and dust, and areas affected by infectious disease outbreaks, wearing a mask is an important means of protecting personnel. However, existing mask-wearing detection algorithms do not perform well: their resistance to interference from irrelevant feature information in complex scenes is weak, and small or occluded targets are prone to missed and false detections. In addition, current algorithms remain insufficient in classification, localization accuracy, and recognition accuracy. To address these problems, this paper improves the YOLOv5s deep neural network model based on deep learning theory, and designs and implements a mask-wearing detection system. The research in this paper is as follows: (1) For the mask-wearing detection problem, images are gathered through web collection, the author's own photography, and video frame extraction, and are combined with open-source datasets to construct an experimental dataset suited to this research topic. The dataset contains a total of 10,629 image samples covering a rich variety of scenes; it is manually labeled and divided into a training set and a test set. (2) In practical applications, the performance of mask-wearing detection algorithms is easily degraded by complex backgrounds, target occlusion, small target size, and the diverse colors and styles of masks. To obtain better detection performance, this paper makes the following improvements to the YOLOv5s deep neural network model. First, the Swin Transformer Block 
module is introduced in the backbone feature extraction network and the neck fusion network of the YOLOv5s model. Its shifted-window self-attention mechanism can learn and fuse the feature information of adjacent windows, so that the network focuses on important feature information in the image and ignores irrelevant feature information while performing global modeling. Second, the spatial pyramid pooling structure is improved by adding a 3×3 convolution kernel on top of the original pooling layer to enlarge the receptive field of the model, and by modifying the spatial pyramid pooling structure into a residual structure to further strengthen the network’s ability to fuse features at four scales, thereby improving the detection of small targets. (3) A deep learning-based mask-wearing detection system is designed and implemented for real-world application scenarios. The system adopts a client/server (C/S) architecture, and through the system interface the user can run detection on video streams, local pictures, and local videos. The hardware consists of a Raspberry Pi, a local server, a camera, an HC-SR602 human infrared sensor, a B-1 human body temperature sensor, an SG90 servo, and Bluetooth audio. When the HC-SR602 detects a person approaching, the system triggers the Raspberry Pi to turn on the camera and capture images. The captured video is transmitted wirelessly to the local server, which performs mask-wearing detection to determine whether on-site personnel are wearing masks correctly; the system also checks whether body temperature is normal. If a missing mask, an incorrectly worn mask, or an abnormal body temperature is detected, the system plays the corresponding audio message to remind the person to wear the mask correctly, or to announce the abnormal body temperature and prohibit 
entry. Experimental and test results show that the proposed algorithm judges mask wearing more accurately in complex scenarios such as normal face detection, occluded face detection, side face detection, and small target detection, and meets the needs of real-world applications. |
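
The alerting step described in part (3) can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the system's actual implementation: the label names, the `decide_alert` function, the message strings, the 37.3 °C threshold, and the choice to check temperature before mask status are all hypothetical, since the abstract does not specify them.

```python
from typing import Optional

# Hypothetical detector labels; the abstract does not name the classes.
MASK_OK = "mask"
MASK_INCORRECT = "mask_incorrect"
NO_MASK = "no_mask"

# Assumed fever threshold in degrees Celsius (not given in the source).
TEMP_LIMIT_C = 37.3


def decide_alert(mask_label: str, temperature_c: float) -> Optional[str]:
    """Return the audio prompt to play, or None when no alert is needed."""
    # Assumption: an abnormal temperature takes priority over mask status,
    # because it is the condition that prohibits entry.
    if temperature_c >= TEMP_LIMIT_C:
        return "Abnormal body temperature, entry prohibited."
    if mask_label == NO_MASK:
        return "Please wear a mask."
    if mask_label == MASK_INCORRECT:
        return "Please wear your mask correctly."
    if mask_label == MASK_OK:
        return None
    # Any other label indicates a mismatch with the detector's class list.
    raise ValueError(f"unknown label: {mask_label!r}")
```

In a deployment along the lines the abstract describes, the label would come from the detection model running on the local server and the temperature from the body temperature sensor; the returned string would then select which audio clip to broadcast.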