| In the coal mining industry, over 90% of accidents are caused by miners’ hazardous actions, which pose a serious threat to both miner safety and the economic performance of the enterprise. Studying methods for identifying miners’ hazardous actions is therefore crucial to safe coal mine production. With the development of deep learning and computer vision, video structuring technology has achieved significant results in identifying hazardous actions in mines, easing the difficulty of manually supervising miners and further promoting the construction of intelligent mines. However, because the working environment of coal mines is unique and complex, current identification of miners’ hazardous actions faces three main technical issues. First, standards for miners’ hazardous actions must be defined according to specific coal mine work scenarios and regulations, and a sample dataset must be created accordingly. Second, critical target features of miners are difficult to extract in complex coal mine working environments affected by factors such as lighting conditions and noise. Third, real-time performance and accuracy must be balanced and improved together. To address these issues, this study takes a cooperating coal mine as an example and carries out the following main work: (1) A dataset of miners’ hazardous actions was developed. Because no large-scale public dataset of miners’ hazardous actions is available, hazardous actions were classified into two major categories based on the coal mine safety risk database standard. The first category, conventional hazardous actions, is divided into six classes: correct wearing of safety helmets (hat), incorrect wearing of safety helmets (err-hat), not wearing safety helmets (no-hat), correct wearing of masks (mask), incorrect wearing of masks (err-mask), and
not wearing masks (no-mask). The second category is posture- and movement-related hazardous actions, including illegal sitting or lying, illegal kicking or hitting, and illegal crossing. Data augmentation was also used to expand the dataset and thereby improve the robustness, generalization, and adaptability of the deep learning model to different data distributions. (2) STFNet-A (Swin Transformer-FPN-Networks-Attention), an object detection model, was built to detect conventional hazardous actions accurately and in real time in complex coal mine environments. First, a feature pyramid is built with the Swin Transformer as the backbone network. To address the imbalance in sample distribution, the Quality Focal Loss (QFL) function is added, yielding the STFNet (Swin Transformer-FPN-Networks) object detection architecture. STFNet has 31.3M parameters, a detection speed of 59.6 FPS, and an accuracy of 94.5% in recognizing conventional hazardous actions. The Convolutional Block Attention Module (CBAM) is then introduced to merge local and global contextual information, giving the STFNet-A model, which further reduces the effect of scene noise. At the same detection speed and parameter size, STFNet-A achieves a higher recognition accuracy than STFNet, reaching 96.9%. (3) Based on AlphaPose, an improved model named Alpha-SDL-Pose (Alpha STFNetA DNLA LSTM Pose) was built to identify hazardous actions in the complex working environments of coal mines. To improve detection speed and reduce model parameters, AlphaPose’s Faster R-CNN detector was first replaced with STFNet-A. Then, a dual non-local attention (DNLA) module was embedded into the pose estimator to extract information such as the target person’s joint locations, joint connections, and skeleton posture, capturing longer-range dependencies and richer contextual information. Finally, hazardous actions were classified using a danger classification
network with long short-term memory (LSTM). On the public MS COCO 2017 dataset, the detection accuracy improved by 2.7% over AlphaPose to 74.8%, and the detection speed increased by 13.6 FPS to 49.4 FPS. On our dataset, with 43.9M model parameters, the recognition accuracy for miners’ hazardous postures and movements was 94.9%. The model balances real-time performance and accuracy and meets the needs of practical engineering applications. (4) A demonstration system platform was built to monitor miners’ hazardous actions. Using Python, the PyTorch deep learning framework, and an open-source visual interface component library, the platform integrates the lightweight, real-time object detection model STFNet-A and the pose estimation model Alpha-SDL-Pose. Its main functions include system settings, function selection, input source selection, camera management, video management, and control of the intersection-over-union (IoU) and confidence thresholds for hazardous action monitoring. The visualization shows regional statistics and the monitoring results. |
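
The platform's threshold controls for confidence and intersection over union can be illustrated with a minimal sketch. It assumes these controls behave like the usual post-processing thresholds of an object detector: low-confidence detections are dropped, then overlapping boxes of the same class are suppressed greedily. The box format `(x1, y1, x2, y2)`, the function names `iou` and `filter_detections`, and the default threshold values are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates in pixels.
Box = Tuple[float, float, float, float]
# Assumed detection format: (box, confidence score, class label).
Detection = Tuple[Box, float, str]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    detections: List[Detection],
    conf_threshold: float = 0.5,  # hypothetical default
    iou_threshold: float = 0.5,   # hypothetical default
) -> List[Detection]:
    """Drop low-confidence detections, then greedily suppress same-class
    boxes that overlap an already kept box by more than iou_threshold."""
    candidates = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept: List[Detection] = []
    for box, score, label in candidates:
        overlaps_kept = any(
            kept_label == label and iou(box, kept_box) > iou_threshold
            for kept_box, _, kept_label in kept
        )
        if not overlaps_kept:
            kept.append((box, score, label))
    return kept
```

In this sketch, raising `conf_threshold` reduces false alarms at the cost of missed detections, while lowering `iou_threshold` suppresses overlapping boxes more aggressively, which is the kind of trade-off such controls would expose to an operator.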