| With the emergence of convolutional neural networks, it has become possible to extract higher-level features from raw images, which has advanced many research fields. In practice, however, very deep networks are prone to the degradation problem during training, and their performance can fall below that of shallower networks. The residual network successfully solves this problem. In this thesis, the residual network is adopted as the backbone for an in-depth study of facial expression recognition. To improve the feature extraction ability and robustness of the network model, this thesis studies facial expression recognition algorithms based on feature fusion and an attention mechanism, built on the residual network architecture. The main work is as follows: To address the problems of insufficient training data and too many training parameters in facial expression recognition algorithms, a facial expression recognition model and algorithm based on transfer learning and the residual network is proposed. Following the idea of transfer learning, the ResNet50 model pre-trained on the ImageNet data set is transferred to the new data set FER2013, and a variety of data augmentation techniques are applied, which effectively expands the training set and alleviates the shortage of training data. Because the input images are smaller than the default input size of the residual network, bilinear interpolation is chosen so that the images are not distorted after being enlarged nearly 5 times and image details are better preserved. The proposed model reduces the excessive number of training parameters in two ways. First, the network parameters of the pre-trained ResNet50 model are transferred to this study, and some convolutional layers (layer0 to layer3) are frozen during fine-tuning to reduce the number of trainable parameters. Second, because the fully connected layer is prone to over-fitting due to its large number of parameters, a global average pooling layer is used in place of the traditional fully connected layer to reduce the dimension, and dropout is added to reduce the dependence of the network model on local features. To prevent the model from falling into a local optimum, a cosine annealing learning rate decay strategy is introduced to further optimize the learning rate. Comparative experiments show that the proposed model and algorithm resist over-fitting well. To address the limited robustness of this improved algorithm, a facial expression recognition model and algorithm based on feature fusion and an attention mechanism is proposed. Exploiting the complementarity between the output features of different levels of the ResNet50 network model, a feature fusion module is designed to combine the strengths of each level into a more discriminative output feature, which is processed together with the last-layer features, which are rich in semantic information. In this thesis, an RKTM (Runge-Kutta Transformer) attention module is designed and placed after the feature fusion module to recalibrate the fused features. Experiments on feature fusion and the attention mechanism show that the two modules cooperate effectively and further balance the network. The proposed model and algorithm effectively improve the robustness of the system and further improve the recognition accuracy. Finally, ablation experiments verify the effectiveness of each module and its impact on the overall model. |
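
The cosine annealing learning rate decay mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which variant is used or any hyperparameters, so the code below assumes the plain single-cycle form, lr(t) = lr_min + 0.5 (lr_max - lr_min)(1 + cos(pi t / T)), and the function name `cosine_annealing_lr` as well as all numeric values are hypothetical choices for illustration only.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Learning rate at `step` under a single-cycle cosine annealing schedule.

    Assumption: plain cosine decay from lr_max to lr_min over total_steps,
    without warm restarts. The thesis may use a different variant.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp the step so values outside the schedule stay at its endpoints.
    step = min(max(step, 0), total_steps)
    cosine = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


# Hypothetical hyperparameters, not taken from the thesis.
schedule = [
    cosine_annealing_lr(t, total_steps=10, lr_max=0.01, lr_min=0.0001)
    for t in range(11)
]
```

The schedule starts at `lr_max`, decays slowly at first, fastest around the midpoint, and flattens out as it approaches `lr_min`, which is why it is often paired with fine-tuning of a pre-trained backbone.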