| Facial expressions are one of the most important ways in which humans convey emotion, and facial expression recognition has long been an important research area in human-computer interaction. It is now widely used in driving assistance, remote education, medical assistance, social security, and other fields. Traditional facial expression recognition methods suffer from insufficient feature extraction and from the influence of human factors. Convolutional neural networks can learn and extract deeper expression features autonomously, which reduces the influence of human factors and yields good generalization and high recognition rates. Although convolutional neural networks outperform traditional methods in feature extraction, deep convolutional neural networks have many parameters and complex structures, which slows model training. The arbitrariness of feature extraction during training can also leave the extracted expression features insufficient. In addition, real-world factors such as lighting, occlusion, and viewing angle impair the network’s feature extraction and seriously degrade recognition performance. To address these issues, this thesis studies facial expression recognition based on convolutional neural networks. The main research contents are as follows. (1) To address the large number of parameters, slow convergence, and low expression recognition rate of deep convolutional neural networks, a new network, AM-VGGNet, is proposed as an improvement on VGGNet-16. First, network parameters are reduced by decreasing the number of convolutional groups in VGGNet-16 and replacing the fully connected layer with a global average pooling layer; the ReLU activation function is replaced with the Leaky ReLU activation function to improve feature extraction; and an SE attention module is introduced to strengthen the extraction of key features. At the same time, a new cross-connection structure is constructed to improve the efficiency of feature utilization. Finally, the network is validated on the public CK+, JAFFE, and Oulu-CASIA datasets. Experiments show that, compared with VGGNet-16, the improved AM-VGGNet increases the recognition rate by 3.16%, 5.07%, and 6.29% on the three datasets, respectively, which demonstrates its effectiveness. (2) To address the insufficient extraction of facial expression features caused by natural environmental factors, a new multi-scale residual attention module is proposed. The module consists of four multi-scale residual attention units, each of which contains feature extraction branches with 1×1, 3×3, and 5×5 convolution kernels and an identity mapping branch. By fusing features along the depth dimension, the module generates multi-scale expression features. In addition, the CBAM attention mechanism is integrated into each unit to capture important local features of expressions and improve the recognition rate for occluded expressions. Building on this module, a multi-scale residual attention network is further proposed, in which a feature residual fusion block fuses shallow and deep features to improve feature completeness and thus recognition accuracy. Experimental results show that, compared with other algorithms, the multi-scale residual attention network performs better in expression recognition and has a greater advantage in recognizing occluded expressions. |
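
The SE attention step named in the abstract can be illustrated with a minimal, framework-free sketch. It follows the commonly described squeeze-and-excitation recipe: global average pooling per channel, a small bottleneck of two dense layers, and a sigmoid gate that rescales each channel. The function names, the toy weights, and the single hidden unit are assumptions made for illustration only; the abstract does not specify them, and this is not the AM-VGGNet implementation. A `leaky_relu` helper is included only to show how Leaky ReLU differs from ReLU on negative inputs.

```python
import math


def global_average_pool(feature_maps):
    """Squeeze step: reduce each channel's H x W map to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]


def dense(vector, weights, biases):
    """Plain fully connected layer: one output per weight row."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def leaky_relu(values, negative_slope=0.01):
    """Like ReLU, but negative inputs keep a small slope instead of becoming 0."""
    return [v if v > 0 else negative_slope * v for v in values]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def se_block(feature_maps, w_reduce, b_reduce, w_expand, b_expand):
    """Squeeze-and-excitation: compute one gate per channel and rescale it."""
    squeezed = global_average_pool(feature_maps)          # one value per channel
    hidden = relu(dense(squeezed, w_reduce, b_reduce))    # bottleneck layer
    gates = sigmoid(dense(hidden, w_expand, b_expand))    # gates in (0, 1)
    scaled = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_maps)
    ]
    return scaled, gates


# Toy input (assumed values): 2 channels, each a 2 x 2 feature map.
features = [
    [[2.0, 2.0], [2.0, 2.0]],  # channel 0
    [[0.0, 0.0], [0.0, 4.0]],  # channel 1
]
# Hypothetical weights: 2 channels -> 1 hidden unit -> 2 channel gates.
w_reduce, b_reduce = [[0.5, 0.5]], [0.0]
w_expand, b_expand = [[1.0], [-1.0]], [0.0, 0.0]

scaled, gates = se_block(features, w_reduce, b_reduce, w_expand, b_expand)
```

With these toy weights, channel 0 receives a gate above 0.5 and channel 1 a gate below 0.5, so the first channel is emphasized relative to the second. In a trained network the two dense layers are learned, which is what lets the gate emphasize channels that carry expression-relevant features.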