| Gesture recognition based on millimeter-wave radar has received increasing attention in recent years owing to its advantages: non-contact operation, strong adaptability, good penetration, high precision, strong privacy protection, strong anti-interference capability, small size, and easy deployment. It has broad application prospects in human-computer interaction, smart homes, vehicle control, gaming and entertainment, and other fields. Gesture recognition involves three key steps: gesture data acquisition, feature extraction, and classification. To address several challenges in dynamic gesture recognition, this thesis builds on the advantages of millimeter-wave radar and applies deep learning to radar-based gesture recognition. The main work is as follows: 1. Methods for acquiring and preprocessing gesture data with millimeter-wave radar are studied. First, data for seven types of dynamic gestures, including push forward, swipe right, clap, and circle clockwise, are collected from ten people. Second, the Range-Doppler features of the dynamic gestures are extracted using the FFT and a clutter suppression method. Then, to segment individual gestures, the frames of the gesture stream are divided into dynamic and static frames by comparing the maximum amplitude of each frame's Range-Doppler map with a preset threshold. Finally, the Range-Doppler maps are cropped and normalized to construct a gesture dataset of 3169 samples, which provides data support for evaluating the performance of gesture recognition algorithms. 2. To fully extract spatio-temporal features from dynamic gestures, a spatio-temporal enhanced convolutional temporal network is proposed. First, the method extracts fine-grained features through deformable convolution operations; on this basis, an attention mechanism is used to enhance useful features and suppress redundant information. Second, a long short-term memory network architecture is introduced, which can effectively establish the spatio-temporal
sequence relationship of gesture features. In addition, because different frames contribute differently to recognition accuracy, the gesture frames are adaptively weighted using an attention mechanism, which further improves the performance of dynamic gesture recognition. 3. To effectively leverage the amplitude and phase information of dynamic gestures at both the global and local levels, a multi-scale 3Dconv-Transformer is proposed. First, the proposed method rearranges the radar Range-Doppler data so that both the amplitude and the phase information of dynamic gestures are retained. Then, a three-dimensional convolutional neural network is designed to model the temporal correlation between adjacent gesture frames, which accurately extracts the spatio-temporal information of gesture frames at the local level. On this basis, an Inception block that uses convolution kernels of different sizes within the same convolution layer is introduced to extract multi-scale gesture features. Finally, to further improve the performance of dynamic gesture recognition, the multi-head attention mechanism of the Transformer network is used to extract spatio-temporal gesture features at the global level. |
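
The threshold-based division into dynamic and static frames described in item 1 can be illustrated with a minimal sketch. This is not the thesis implementation. The names `frame_peak` and `segment_gestures`, the `min_len` parameter, and the use of plain nested lists for Range-Doppler magnitude maps are all assumptions made for illustration. The abstract states only that each frame's maximum amplitude is compared with a preset threshold.

```python
from typing import List, Optional, Sequence, Tuple

# One Range-Doppler magnitude map, stored as rows of amplitudes (assumed layout).
Frame = Sequence[Sequence[float]]


def frame_peak(frame: Frame) -> float:
    """Return the maximum amplitude found in one Range-Doppler map."""
    return max(max(row) for row in frame)


def segment_gestures(frames: Sequence[Frame], threshold: float,
                     min_len: int = 1) -> List[Tuple[int, int]]:
    """Split a stream of frames into candidate gesture segments.

    A frame counts as dynamic when its peak amplitude exceeds `threshold`;
    otherwise it is static. Each run of consecutive dynamic frames becomes
    one half-open (start, end) segment. Runs shorter than `min_len` frames
    are dropped (`min_len` is a hypothetical extra, not from the source).
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, frame in enumerate(frames):
        dynamic = frame_peak(frame) > threshold
        if dynamic and start is None:
            start = i                      # a gesture begins
        elif not dynamic and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None                   # the gesture has ended
    # Close a gesture that is still running at the end of the stream.
    if start is not None and len(frames) - start >= min_len:
        segments.append((start, len(frames)))
    return segments
```

In this sketch, the returned index pairs could be used to slice the frame stream into individual gesture samples before cropping and normalization. How the threshold is chosen is not stated in the abstract and would have to be tuned to the radar configuration.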