The attention mechanism is a technique for highlighting feature information. Because it can emphasize useful information in feature maps, suppress useless information, and improve network performance without changing the overall structure of a deep learning network, it has become a research hotspot in deep learning in recent years and has been widely applied. In this thesis, a new attention mechanism module and a new activation function are proposed to address the defects and deficiencies of existing attention modules and activation functions; both are applied to head pose estimation, for which a lightweight head pose estimation network is also proposed. The main research contents are as follows:

(1) Most current attention mechanism modules improve the accuracy of deep learning models but also increase model complexity. To address this problem, we propose a lightweight Efficient Dual-Channel Attention mechanism module (EDCA). EDCA compacts and rearranges the feature maps along the channel, width, and height dimensions. A one-dimensional convolution then obtains the combined weight information, which is split and applied to the corresponding dimensions to yield the feature attention. In this thesis, EDCA is thoroughly evaluated on the image classification dataset mini-ImageNet and the object detection dataset VOC2007. The experimental results show that, compared with SENet, CBAM, SGE, ECA-Net, and Coordinate Attention, EDCA requires less computation and fewer parameters.

(2) Activation functions commonly used in deep learning networks suffer from problems such as vanishing gradients, unfavorable back-propagation, and poor probability distributions. To address these problems, we propose a new activation function, Saturation Attention Activation (SAA). In theory, compared with activation functions such as Sigmoid, Tanh, Softsign, and Hard Sigmoid, SAA is non-centrosymmetric and continuously differentiable, giving it a larger gradient and a more reasonable output range. Extensive experiments on two representative visual tasks, image classification and object detection, also show that replacing the existing S-type activation functions with SAA effectively improves the performance of multiple attention modules in existing deep learning networks and increases the accuracy of both image classification and object detection.

(3) Existing head pose estimation methods rely on facial key points, use large models that hinder practical deployment, and suffer from low accuracy. To address these problems, this thesis proposes Neko Net, a landmark-free two-stream lightweight head pose estimation network. Neko Net first uses a two-stream lightweight backbone to extract head pose features, then passes the features to an external attention module to strengthen attention to the useful features, and finally uses an SSR (Soft-Stage Regression) module to obtain the estimated values of the three head rotation angles. Extensive experiments and visualized test results on the head pose datasets AFLW2000 and BIWI show that Neko Net has fewer parameters and lower estimation error than recent head pose estimation networks.
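The EDCA pipeline described in (1), pooling along channel, width, and height, running one shared 1-D convolution over the concatenated descriptors, then splitting the weights back onto their dimensions, can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the pooling operator (global average), the sigmoid gating, and the fixed averaging kernel are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def edca_sketch(x, kernel_size=3):
    """Rough sketch of an EDCA-style attention over a feature map x of shape
    (C, H, W): compact each dimension, rearrange into one joint descriptor,
    apply a single 1-D convolution, split, and gate the input."""
    C, H, W = x.shape
    # "Compact" each dimension via global average pooling (assumed).
    d_c = x.mean(axis=(1, 2))            # (C,) channel descriptor
    d_h = x.mean(axis=(0, 2))            # (H,) height descriptor
    d_w = x.mean(axis=(0, 1))            # (W,) width descriptor
    # "Rearrange" into one joint descriptor.
    d = np.concatenate([d_c, d_h, d_w])
    # One shared 1-D convolution (hypothetical fixed averaging kernel).
    k = np.ones(kernel_size) / kernel_size
    a = sigmoid(np.convolve(d, k, mode="same"))
    # Split the combined weights back onto their own dimensions and apply.
    a_c, a_h, a_w = a[:C], a[C:C + H], a[C + H:]
    return x * a_c[:, None, None] * a_h[None, :, None] * a_w[None, None, :]

x = np.random.rand(8, 5, 6)
y = edca_sketch(x)
print(y.shape)   # attention-weighted map, same shape as the input
```

Because the 1-D convolution is shared across all three descriptor segments, the parameter count stays constant in the channel count, which is the property the abstract contrasts against SENet-style fully connected excitation layers.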
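The vanishing-gradient claim in (2) can be checked numerically for the S-type activations the abstract names: the derivative of the sigmoid never exceeds 0.25, and the hard sigmoid's slope is a flat 0.2 inside its linear region, so stacked layers shrink gradients multiplicatively. The snippet below illustrates only those existing activations; it does not reproduce SAA itself, whose formula the abstract does not give (the hard sigmoid here is the common clip(0.2x + 0.5, 0, 1) variant, an assumption).

```python
import numpy as np

x = np.linspace(-6.0, 6.0, 10001)   # dense grid that includes x = 0

# Derivatives of the S-type activations named in the text.
s = 1.0 / (1.0 + np.exp(-x))
d_sigmoid = s * (1.0 - s)                              # peaks at 0.25
d_tanh = 1.0 - np.tanh(x) ** 2                         # peaks at 1.0
d_softsign = 1.0 / (1.0 + np.abs(x)) ** 2              # peaks at 1.0
d_hard_sigmoid = np.where(np.abs(x) < 2.5, 0.2, 0.0)   # slope of clip(0.2x+0.5)

for name, d in [("sigmoid", d_sigmoid), ("tanh", d_tanh),
                ("softsign", d_softsign), ("hard sigmoid", d_hard_sigmoid)]:
    print(f"max |d({name})/dx| = {d.max():.2f}")
```

The printed maxima (0.25, 1.00, 1.00, 0.20) show why a gradient passed backward through many sigmoid or hard-sigmoid gates decays quickly, which is the deficiency SAA's larger gradient is designed to relieve.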
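The SSR stage at the end of the Neko Net pipeline in (3) converts per-bin probabilities into continuous angle estimates. A minimal single-stage version of soft regression is sketched below; the bin count, the symmetric angle range, and the omission of SSR's multi-stage refinement are all assumptions made for illustration, as the abstract gives no details.

```python
import numpy as np

def soft_regression(logits, angle_range=198.0):
    """Soft regression over angle bins: softmax the logits, then return the
    probability-weighted average of the bin centres (degrees). One row of
    logits per rotation component (yaw, pitch, roll)."""
    n_bins = logits.shape[-1]
    # Bin centres spread symmetrically over [-range/2, +range/2] (assumed).
    centres = (np.arange(n_bins) + 0.5) / n_bins * angle_range - angle_range / 2
    # Numerically stable softmax across the bin axis.
    p = np.exp(logits - logits.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return (p * centres).sum(axis=-1)   # expected angle per rotation

# One logit vector per rotation; uniform logits regress to the range centre, 0.
logits = np.array([[0.1, 2.0, 0.3],    # yaw
                   [1.5, 0.2, 0.1],    # pitch
                   [0.0, 0.0, 0.0]])   # roll (uniform)
angles = soft_regression(logits)
print(angles)
```

Because the expected value varies smoothly with the logits, the whole head of the network stays differentiable, which is what lets a classification-style backbone be trained end-to-end for continuous yaw, pitch, and roll regression.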