| Handwashing action recognition is an important part of hand hygiene compliance monitoring, and handwashing is an important habit in daily life. Traditional hand hygiene compliance monitoring can detect whether a subject is washing their hands, but these methods usually rely on sensors and cannot identify the individual steps of the handwashing action. In recent years, machine learning and deep learning algorithms have developed rapidly and have been widely applied to action recognition. This work studies two problems in real-time handwashing action recognition: the low accuracy of network models on handwashing action images, and the difficulty of running large network models on limited hardware resources. The specific research contents are as follows. (1) Because the hand region occupies a small proportion of the overall image and environmental noise occupies a large proportion, it is difficult for a recognition network to locate the hand effectively. The OpenCV image library is used to extract image frames from the video stream, and a YOLOv5 network is used to detect the hand region in each frame. The YOLOv5-based hand region detection model achieves an average accuracy of 98.93% on the dataset and an average detection speed of 55.56 frames/s. The experimental results show that the hand region detection network both detects the hand position effectively and runs at a good detection speed. (2) Because embedded and mobile devices have limited hardware resources, it is difficult for them to run network models with high resource requirements. Four neural networks, VGG16, ResNet34, MobileNetV2 and EfficientNet-B0, are compared experimentally on the dataset. In the experiment, MobileNetV2 reaches an accuracy of 62.6% with a network size of 3.50 M and 1.54 G of video memory occupied during training. Its overall performance is better than that of the other three recognition networks, and the experimental results show that MobileNetV2 is the most suitable of the four for the handwashing recognition task. (3) To further improve the accuracy of the recognition model, a two-stream convolutional neural network based on the MobileNetV2 architecture is proposed. First, because the pixel-level feature differences between images of different handwashing actions are small, the CBAM attention mechanism is introduced into the inverted residual structure so that the network can extract more of the information that distinguishes the images. Second, to address insufficient use of feature information, a feature extraction branch composed of a residual structure and CBAM is added so that the network can extract more feature information. Then, an early fusion strategy is adopted to fuse the feature information extracted by the different network branches. Finally, in the decoding stage, the fused feature information is decoded and a confidence sequence is output. Compared with the original network, the two-stream convolutional neural network increases the accuracy by 11.3%, the average precision by 0.092, and the average recall by 0.02. The experimental results show that the two-stream convolutional neural network is better suited to handwashing action recognition and effectively distinguishes between the different handwashing steps. (4) To make the handwashing action recognition network more practical, it is combined with the hand region detection module to build a real-time handwashing action recognition system, which is deployed on a Raspberry Pi 4B embedded device. |
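
The fusion and decoding steps described above can be illustrated with a minimal sketch. The abstract does not state the fusion operator or the form of the decoder, so this sketch assumes that early fusion is a concatenation of the two branches' feature vectors and that decoding is a single linear layer followed by a softmax. All names and numbers (`early_fusion`, `decode`, the toy features and weights) are hypothetical and are not taken from the original work.

```python
import math


def early_fusion(main_features, branch_features):
    """Fuse two feature vectors by concatenation (assumed fusion operator)."""
    return list(main_features) + list(branch_features)


def decode(fused, weights, biases):
    """Assumed decoder: a linear layer followed by softmax.

    Returns one confidence value per handwashing step.
    """
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy, hypothetical values standing in for real network outputs.
main_features = [0.2, 0.7]   # features from the main (MobileNetV2-style) stream
branch_features = [0.5]      # features from the extra residual + CBAM branch

# Identity weights: three output classes, one per fused feature, for illustration only.
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
biases = [0.0, 0.0, 0.0]

fused = early_fusion(main_features, branch_features)
confidences = decode(fused, weights, biases)

# The predicted handwashing step is the index with the highest confidence.
predicted_step = max(range(len(confidences)), key=confidences.__getitem__)
```

In a real model the fused tensor would be a multi-channel feature map rather than a short list, but the order of operations (fuse first, then decode to per-step confidences) is the part this sketch is meant to show.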