| Human pose estimation is an active research topic in computer vision. It analyzes images or videos of the human body and detects and locates body keypoints to obtain pose information. In recent years, thanks to the rapid development of deep learning and convolutional neural networks, human pose estimation networks have been widely applied in motion capture, film, gaming, virtual reality, and other fields. However, the task still faces several challenges: complex network designs lead to excessive parameter counts and computational cost, and detection accuracy suffers from occlusion, pose diversity, viewpoint changes, and lighting variations. This thesis therefore proposes two algorithms to address these problems and applies one of them to the development of a fitness scoring system. The main contributions and innovations of this thesis are as follows. (1) To address excessive network parameters and computational cost, a lightweight human pose estimation network based on HRNet is proposed. The network uses HRNet as its backbone and replaces the standard convolutions of the original network with lightweight modules, reducing both computational cost and parameter count. In addition, a parameter-free attention mechanism is added to capture channel and spatial information from the feature map, improving the network’s ability to detect keypoints in the image. Comparative experiments against currently popular network models are conducted on the COCO and MPII datasets. The results show that the proposed network significantly reduces parameters and computational cost at the expense of a small loss in accuracy. (2) The ConvNeXt V2 network is introduced into the human pose estimation task as the backbone network. To address the low detection accuracy caused by occlusion and other factors, the VAN module, which combines the advantages of convolution and self-attention, is incorporated to improve the backbone. This strengthens the network’s ability to extract local contextual information and to adapt along both the spatial and channel dimensions, which improves feature extraction for small-scale input images and occluded human keypoints and ultimately raises the overall detection accuracy of human pose keypoints. (3) To address the high cost of professional fitness evaluation, the lightweight human pose estimation network (IGSNet) is combined with the DTW algorithm to design and implement a fitness scoring system. The system uses the lightweight network to estimate the user's pose, applies the DTW algorithm to score the poses, and finally provides recommendations based on the scores. |
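
The DTW-based scoring step can be illustrated with a minimal sketch. It assumes each frame is a list of 2D keypoints `(x, y)` already produced by a pose estimator, and that a user sequence is compared against a reference (template) sequence. The function names, the mean per-keypoint Euclidean frame distance, and the exponential mapping from distance to a 0-100 score are illustrative assumptions, not the scoring rule used in the thesis.

```python
import math


def pose_distance(pose_a, pose_b):
    """Mean Euclidean distance between corresponding (x, y) keypoints.

    Assumption: both poses list the same keypoints in the same order.
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must have the same number of keypoints")
    total = sum(math.dist(p, q) for p, q in zip(pose_a, pose_b))
    return total / len(pose_a)


def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping cost between two pose sequences."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    # cost[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = pose_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # seq_a advances, seq_b frame repeats
                cost[i][j - 1],      # seq_b advances, seq_a frame repeats
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def similarity_score(user_seq, template_seq, scale=1.0):
    """Map the DTW cost to a 0-100 score (hypothetical mapping).

    The cost is normalized by the combined sequence length so that longer
    clips are not penalized; `scale` is an assumed tuning parameter.
    """
    normalized = dtw_distance(user_seq, template_seq) / (
        len(user_seq) + len(template_seq)
    )
    return 100.0 * math.exp(-normalized / scale)


# Toy example with two keypoints per frame.
template = [[(0, 0), (1, 1)], [(0, 1), (1, 2)], [(0, 2), (1, 3)]]
# Same motion performed more slowly: the first frame is held twice.
slow_user = [template[0], template[0], template[1], template[2]]
# Same motion with every keypoint displaced by (3, 4).
shifted_user = [[(x + 3, y + 4) for (x, y) in frame] for frame in template]

print(similarity_score(slow_user, template))
print(similarity_score(shifted_user, template))
```

Because DTW aligns frames non-linearly in time, the slower repetition of the same motion is not penalized, whereas the spatially displaced sequence receives a lower score. In practice keypoints would typically be normalized (for example by body size and position) before comparison; that step is omitted here.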