| Today, with the continuous enrichment of material life, people pay increasing attention to their own health. Heart rate variability (HRV) reflects the variation in the intervals between successive heartbeats and indicates the activity of the human autonomic nervous system. A large body of research has confirmed that autonomic nervous activity is associated with many diseases, and that some cardiovascular and cerebrovascular diseases are especially closely associated with it. HRV analysis can therefore support the prevention of cardiovascular and other diseases. Traditional HRV analysis methods require equipment in contact with the skin, which limits their use in some scenarios, and prolonged skin contact also causes discomfort. In recent years, various non-contact physiological measurement methods have been studied by researchers worldwide. Remote photoplethysmography (rPPG) is currently the main method of non-contact physiological measurement; it is contact-free, simple, low-cost, and reasonably accurate. However, motion artifacts, illumination changes, video resolution, and other factors still limit the application of rPPG in daily life. Based on facial video streams, this paper studies non-contact heart rate variability analysis using deep learning and rPPG. The main work of this paper is as follows: (1) An rPPG heart rate variability analysis method based on an efficient spatiotemporal attention network is proposed. Drawing on efficient network architectures for 2D convolutional neural networks, this method replaces standard 3D convolution with 3D depthwise separable convolution and constructs an efficient spatiotemporal attention network to extract the rPPG signal from facial video. This not only reduces the complexity of the network but also improves the accuracy of heart rate variability analysis. At the same time, a lightweight hybrid attention 
module that combines spatial attention and channel attention is constructed for the network, which allows the network to focus on extracting physiological signals and ignore irrelevant information. At the end of the network, a long-term temporal context enhancement block based on a lightweight recurrent convolutional neural network structure is introduced to strengthen the network's ability to learn long-term temporal context and to improve its robustness to various noise disturbances. Experimental results show that the proposed non-contact HRV analysis method achieves better results than other methods on both time-domain and frequency-domain HRV features, and has lower network complexity than other methods based on 3D convolutional neural networks. (2) To address the loss of rPPG information in low-resolution facial video, a two-stage rPPG heart rate variability analysis method based on a video super-resolution network is proposed, building on the efficient spatiotemporal attention network. First, a video super-resolution network is constructed to recover the rPPG information lost in low-resolution facial video. In this network, a frame recurrent generator is used to enhance the low-resolution facial video. The generator works recurrently, taking the previously estimated high-resolution frame as input to the subsequent iteration, so that it can recover the high-frequency and temporal information in the video, including rPPG information. During training, a spatiotemporal discriminator is used for adversarial learning with the frame recurrent generator. The spatiotemporal discriminator takes the low-resolution frame group, the generated high-resolution frame group, and the real high-resolution frame group as inputs. In this way, it provides the network with gradient information about the realism of both spatial structure and short-term temporal information. It improves the 
ability of the frame recurrent generator to recover the high-frequency and temporal information in the video. Then, the generator's ability to recover rPPG information is further improved by jointly training the frame recurrent generator and the efficient spatiotemporal attention network. Finally, the two networks are combined to extract the rPPG signal for HRV analysis. Experimental results show that the proposed method effectively improves the signal-to-noise ratio of the rPPG signal extracted from low-resolution facial video and achieves better HRV results than using the low-resolution facial video directly or upsampling it with traditional interpolation methods. |
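
The abstract evaluates the method on time-domain HRV features but does not list them. As an illustration only, the sketch below computes three widely used time-domain features (SDNN, RMSSD, and pNN50) from a sequence of inter-beat intervals in milliseconds, such as those obtained from the peaks of an extracted rPPG signal. The function name `hrv_time_domain` and the choice of features are assumptions made for this example; they are not taken from the paper, which may use a different feature set or definitions.

```python
import math
import statistics


def hrv_time_domain(ibi_ms):
    """Compute common time-domain HRV features.

    ibi_ms: inter-beat intervals in milliseconds (hypothetical input,
    e.g. peak-to-peak distances of an rPPG signal).
    """
    if len(ibi_ms) < 3:
        raise ValueError("need at least three inter-beat intervals")

    # Differences between successive inter-beat intervals.
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    mean_ibi = statistics.fmean(ibi_ms)

    return {
        # Mean heart rate in beats per minute.
        "mean_hr_bpm": 60000.0 / mean_ibi,
        # SDNN: sample standard deviation of the intervals.
        "sdnn_ms": statistics.stdev(ibi_ms),
        # RMSSD: root mean square of successive differences.
        "rmssd_ms": math.sqrt(statistics.fmean([d * d for d in diffs])),
        # pNN50: fraction of successive differences larger than 50 ms.
        "pnn50": sum(abs(d) > 50 for d in diffs) / len(diffs),
    }


if __name__ == "__main__":
    features = hrv_time_domain([800, 810, 790, 850, 800])
    for name, value in features.items():
        print(f"{name}: {value:.3f}")
```

Such features are sensitive to individual mis-detected beats, which is one reason the signal quality of the extracted rPPG waveform matters for HRV analysis more than it does for average heart rate estimation.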