| Facial expression recognition uses computer algorithms to infer human psychological and emotional states. It is an essential component of intelligent human-computer interaction and a research hotspot in artificial intelligence, drawing on physiology, psychology, computer science, image processing, and pattern recognition. Facial expression recognition based on deep learning has achieved great success, but several problems still limit recognition accuracy: 1) the expression features captured by convolutional neural network-based methods lack structural information, which limits the comprehensiveness of the feature representation; 2) the Softmax loss used in these methods ignores the fact that expression classes have large intra-class variability and small inter-class differences, which limits classification accuracy; 3) annotator subjectivity and the uncertainty caused by blurry in-the-wild facial images limit dataset quality, and differing acquisition conditions limit the application of facial expression recognition across datasets. To address these problems, this dissertation studies feature extraction, feature classification, and dataset quality improvement for static facial expression images, and proposes a series of targeted methods to improve the accuracy and generalization ability of facial expression recognition models. The main research contents are as follows: (1) To address the lack of structural information in features captured by convolutional neural network-based facial expression recognition, this dissertation proposes a facial expression recognition method based on the
fusion of structural features and global features. First, facial structural features under different expressions are automatically extracted by the proposed facial landmark-guided graph convolutional neural network. In parallel, a convolutional neural network is applied to the entire face to learn global features for different expressions. The structural and global features are then fused into a more comprehensive high-level semantic feature representation. Finally, because the two types of features differ in importance for classifying different expressions, a channel attention mechanism learns weights for them, yielding more accurate feature classification. The entire model is trained end-to-end, and the experimental results demonstrate that the proposed method is effective and improves the accuracy of the facial expression recognition model. (2) The landmark-guided graph convolutional neural network relies on facial landmarks and on fixed facial structures designed with prior knowledge when extracting facial structural features under different expressions. To address this, this dissertation further proposes a facial expression recognition method based on the fusion of structure-adaptive features and global features. First, more discriminative structure-adaptive facial features under different expressions are automatically extracted by the proposed structure-adaptive deep network. Meanwhile, a convolutional neural network is applied to the whole face to obtain global features. The two types of features are then combined into a more comprehensive and discriminative high-level semantic feature representation. Finally, a channel attention mechanism assigns weights to the two types of features to obtain more accurate feature representations, which are then classified. The experimental results show that this method can effectively extract
expression features in an end-to-end manner without relying on any prior knowledge, increase the discriminative power of the features, and further improve the accuracy of the facial expression recognition model. (3) To address the problem that the Softmax loss used in convolutional neural network-based facial expression recognition ignores the large intra-class variability and low inter-class discrimination of expression classes, this dissertation proposes a facial expression recognition method that combines a full isolation loss with the Softmax loss. The proposed full isolation loss function optimizes the intra-class and inter-class distances to enhance the discriminability of expression features, and it is trained jointly with the Softmax loss function. The experimental results show that the joint loss function has advantages over existing loss functions and learns deep expression features that are easier to distinguish. Applying the joint loss function to the feature-fusion-based facial expression recognition model proposed in this dissertation further improves the model's accuracy. (4) To address the low certainty of image annotations in expression datasets and poor robustness across datasets, this dissertation proposes a cross-dataset facial expression recognition method based on self-correcting dataset labels. The method refines the labels of large-scale datasets with the proposed self-correcting dataset labeling algorithm, which mainly consists of an image importance weight module and a relabeling module, to improve the accuracy of cross-dataset expression recognition. First, the image importance weight module generates an importance weight score for each training sample and sorts the samples by score. The relabeling module then regenerates labels for the uncertain samples, training proceeds iteratively, and a more robust and accurate model is finally
obtained. In addition, this dissertation provides an alternative dataset-fusion scheme, which fuses the source and target datasets to balance model accuracy and generalization ability, and addresses two problems in cross-dataset expression recognition: small data scale and neglect of performance on the source dataset. The experimental results demonstrate the effectiveness of the proposed method and show that it further improves the generalization ability of the facial expression recognition model. |
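
The label self-correction step in (4) is described only at a high level, so the following is a minimal illustrative sketch and not the dissertation's algorithm. It assumes that each training sample already carries an importance weight score and the model's softmax probabilities, that the lowest-weighted fraction of samples is treated as "uncertain", and that an uncertain sample is relabeled when the model's top class beats the annotated class by a fixed margin. The names `Sample`, `relabel_uncertain`, `low_ratio`, and `margin` are hypothetical and chosen here for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    label: int          # current (possibly noisy) annotation
    probs: List[float]  # model's softmax probabilities for this sample
    weight: float       # importance weight score (assumed precomputed)


def relabel_uncertain(samples: List[Sample],
                      low_ratio: float = 0.3,
                      margin: float = 0.2) -> List[int]:
    """Return a new label list; only low-weight samples may change.

    Assumed rule (not taken from the source): a sample in the
    lowest-weighted `low_ratio` fraction is relabeled to the model's
    top class if that class's probability exceeds the probability of
    the annotated class by more than `margin`.
    """
    # Sort sample indices by importance weight, lowest first.
    order = sorted(range(len(samples)), key=lambda i: samples[i].weight)
    n_low = int(len(samples) * low_ratio)

    new_labels = [s.label for s in samples]
    for i in order[:n_low]:
        s = samples[i]
        best = max(range(len(s.probs)), key=lambda c: s.probs[c])
        if s.probs[best] - s.probs[s.label] > margin:
            new_labels[i] = best
    return new_labels


# Toy data: four samples, three expression classes.
samples = [
    Sample(label=0, probs=[0.1, 0.8, 0.1], weight=0.2),     # uncertain, model disagrees
    Sample(label=2, probs=[0.3, 0.3, 0.4], weight=0.9),     # trusted
    Sample(label=1, probs=[0.4, 0.5, 0.1], weight=0.3),     # uncertain, model agrees
    Sample(label=0, probs=[0.05, 0.05, 0.9], weight=0.95),  # trusted, left untouched
]
print(relabel_uncertain(samples, low_ratio=0.5))
```

In an iterative scheme of the kind the abstract describes, the returned labels would replace the annotations before the next training round. How the weights are learned and how the threshold is chosen are left open here, because the abstract does not specify them.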