| In recent years, with advances in science and technology, human-computer interaction devices such as robots and smart wearables have gradually entered people’s daily lives, bringing many conveniences to work, study and life. Improving the practicality, convenience and safety of human-computer interaction and collaboration has become an issue of widespread concern among researchers. As a silent language, hand gestures are natural, direct and expressive, but the high degree of freedom of hand movement and complex environmental changes place higher demands on fast and accurate recognition of dynamic hand gestures. Therefore, this thesis focuses on the design and implementation of dynamic gesture recognition methods based on deep convolutional neural networks, addressing both response speed and recognition accuracy. The main contents of this thesis are as follows. Firstly, to achieve fast recognition of dynamic gestures, a dynamic gesture recognition method based on spectral analysis is established, exploiting the small data volume and high computational efficiency of surface electromyography (sEMG) signals. A spectral-analysis-based preprocessing algorithm for sEMG signals is designed, in which the multi-channel time-series signal is converted into a two-dimensional image through time-frequency domain conversion. This removes the interference in each channel while capturing both the time-frequency information of the sEMG signal and the spatial information between channels. On this basis, a lightweight convolutional neural network is designed to predict dynamic gestures rapidly. Secondly, because sEMG signals have weak representational ability and cannot effectively handle complex and diverse dynamic gestures, which leads to poor recognition accuracy, a multi-feature fusion dynamic gesture recognition framework is proposed to achieve accurate recognition of dynamic gestures. The AlexNet2 
video image feature descriptor is designed to realize the spatiotemporal feature representation of dynamic gestures. At the same time, exploiting the different representational capabilities of handcrafted features and deep features, a dual-branch recognition model based on support vector machines is designed to obtain the prediction probabilities of the different features. A fusion algorithm based on Dempster-Shafer evidence theory is then proposed, which realizes the decision-level fusion of low-level image features and high-level deep features and effectively improves the accuracy and robustness of dynamic gesture recognition. Thirdly, to address the complex model and low computational efficiency of the multi-feature fusion algorithm, a classification model based on a spatiotemporal attention mechanism is proposed to realize end-to-end dynamic gesture recognition. Building on the advantages of 3D convolutional neural networks in representing 3D data, a classification network based on a 3D deep residual network is built. Temporal attention and spatial attention modules are also designed, and an attention-based 3D deep residual classification model is constructed to obtain discriminative spatiotemporal semantic features, so that the classification network pays more attention to the representation of the target area. Experimental results show that the proposed methods achieve fast and accurate recognition of dynamic gestures. |
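
The decision-level fusion step mentioned in the abstract can be illustrated with Dempster's rule of combination. The following is a minimal sketch, not the thesis implementation: it assumes that each SVM branch outputs class probabilities and that these are used directly as masses on singleton hypotheses, which the abstract does not specify. The function names (`dempster_combine`, `probs_to_masses`), the gesture labels and the probability values are all hypothetical and chosen only for illustration.

```python
from itertools import product


def probs_to_masses(probs, labels):
    """Turn class probabilities into a mass function on singleton sets.

    Assumption: each branch's probability for a class is used directly
    as the mass of that single-class hypothesis.
    """
    return {frozenset([label]): p for label, p in zip(labels, probs) if p > 0}


def dempster_combine(m1, m2):
    """Combine two mass functions (frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the common hypothesis.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint hypotheses contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict and cannot be combined")
    # Normalize by 1 - K so the fused masses sum to one.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical gesture classes and branch outputs (illustrative values only).
labels = ["fist", "wave", "pinch"]
handcrafted_probs = [0.6, 0.3, 0.1]  # branch using handcrafted image features
deep_probs = [0.5, 0.1, 0.4]         # branch using deep features

fused = dempster_combine(
    probs_to_masses(handcrafted_probs, labels),
    probs_to_masses(deep_probs, labels),
)
predicted = max(fused, key=fused.get)
print(sorted(predicted), round(fused[predicted], 4))
```

With singleton-only masses the rule reduces to a normalized product of the two branches' probabilities, so a class that both branches support is reinforced while a class supported by only one branch is suppressed. Assigning some mass to the full label set would be one way to model a branch's uncertainty, but that is a design choice outside what the abstract states.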