| Illness and accidents can lead to motor impairment in the elderly, and home rehabilitation training after illness is particularly important for their health. Intelligent rehabilitation training can guide and supervise home rehabilitation by recognizing the patient's movements and comparing them with standard movements. This paper therefore investigates rehabilitation action recognition by designing a spatio-temporal graph convolutional network model with a hierarchical residual structure and an attention mechanism, and then fusing it with AlphaPose pose estimation and with target detection and tracking algorithms to achieve multi-person action recognition. To address the inadequate feature extraction and single-joint feature modeling of existing models, a spatio-temporal graph convolutional network model with a hierarchical residual structure is proposed. In order to extract multi-scale features at a finer granularity and improve accuracy without increasing the computational load, the 7-layer sequential structure of the spatio-temporal graph convolution module GT in the original network is restructured into a hierarchical residual structure, GT-Res2Net. Because the multi-layer hybrid convolution of Res2-STGCN fuses the channel and spatial information of receptive fields while extracting multi-scale features from skeleton data, and the "grouping" mechanism of the hierarchical residuals weakens the correlation between channels, a new spatio-temporal graph module with an attention mechanism (GT-Attention) is added after GT-Res2Net so that channel features are adjusted adaptively. The improved module and the original module together form the new model, Res2SC-STGCN. Bone data also carry a large amount of action-related information, so a dual-stream model is established: joint and bone features are extracted simultaneously to make full use of the skeleton data, and the two streams are fused using a weighted approach. Since the improved model handles only single-person action recognition, this paper fuses target detection, tracking and pose estimation with the improved model to achieve multi-person action recognition in real scenes. The experimental results show that, under the two evaluation protocols of the public dataset NTU-RGB+D, the best models reach Top-1 accuracies of 88.60% and 95.11% for the joint stream, 90.58% and 96.12% for the bone stream, and 91.66% and 97.12% for the fused model, respectively. These accuracies are substantially higher than those of the baseline network (ST-GCN). In addition, the recognition accuracies on the self-built rehabilitation dataset both exceed 97%. The fused algorithms also achieve good results in recognizing multi-person actions under different conditions in rehabilitation scenarios. |