Facial expression recognition has long been a central topic in affective computing and artificial intelligence. However, due to factors such as differences in data distribution and variations in data collection conditions, well-trained facial emotion classifiers often perform poorly on new databases. This paper therefore studies cross-database facial expression recognition, aiming to transfer knowledge from existing facial emotion databases to new ones, and improves performance on new databases through data feature processing and domain adaptation. The main work is summarized as follows:

(1) To address the unstable temporal information introduced in facial expressions and to reduce domain distribution differences, this paper proposes a domain adaptation method based on spatiotemporal motion attention. Spatiotemporal feature points capture changes in spatiotemporal information and, when applied to facial expressions, effectively represent facial muscle movement. Because different expressions cause different degrees of facial muscle movement, thresholding the spatiotemporal feature points yields more stable temporal information. The method converts the spatiotemporal feature points into attention weight maps, which are stacked with the facial expression features to highlight the facial regions of high attention during expression generation. In addition, because the domain distribution differences among facial expression databases are large, the method uses a deep domain adaptation network that minimizes the maximum mean discrepancy to reduce the feature distribution differences between databases and improve transfer performance. Multiple transfer learning tasks were set up on three databases, namely RAVDESS, FABO, and eNTERFACE, and extensive cross-database emotion recognition experiments were conducted. The results demonstrate the effectiveness of the method: with the three databases as target domains, it achieves accuracies of 52.42%, 44.44%, and 35.96%, respectively, improving on the current state-of-the-art methods of the same type by 1.19%, 1.91%, and 0.73%.
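The abstract names the discrepancy measure but does not reproduce its formulation; as a minimal sketch, the standard empirical (biased, squared) maximum mean discrepancy between source features \(f^s_1,\dots,f^s_{n_s}\) and target features \(f^t_1,\dots,f^t_{n_t}\), with \(\phi\) the feature map of a kernel \(k\) in an RKHS \(\mathcal{H}\) (notation introduced here for illustration, not taken from the thesis), is:

\[
% standard biased MMD estimator; shown for illustration, not quoted from the thesis
\widehat{\mathrm{MMD}}^2(\mathcal{D}_s,\mathcal{D}_t)
= \Bigl\| \tfrac{1}{n_s}\sum_{i=1}^{n_s}\phi(f^s_i) - \tfrac{1}{n_t}\sum_{j=1}^{n_t}\phi(f^t_j) \Bigr\|_{\mathcal{H}}^2
= \tfrac{1}{n_s^2}\sum_{i,i'}k(f^s_i,f^s_{i'}) - \tfrac{2}{n_s n_t}\sum_{i,j}k(f^s_i,f^t_j) + \tfrac{1}{n_t^2}\sum_{j,j'}k(f^t_j,f^t_{j'}).
\]

Adding such a term to the source-domain classification loss encourages the attention-weighted expression features of the two databases to follow a common distribution.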
(2) To extract finer-grained information for domain adaptation in multiple sub-feature spaces, this paper proposes a joint dynamic domain adaptation method based on multi-representation feature extraction. Common domain adaptation methods usually align source- and target-domain features with a single network structure, so the resulting representation often captures only part of the information. This method instead uses a hybrid structure for multi-representation feature extraction, mapping the original facial expression features into multiple sub-feature spaces so that the source and target expression features are aligned from different perspectives and extracted more comprehensively. Each sub-feature space contains detail information unique to the two sub-domains within each category. To extract this finer-grained information from facial expressions, the method jointly and dynamically minimizes the maximum mean discrepancy and the local maximum mean discrepancy in each sub-structure, reducing the feature distribution differences between sub-domains and improving overall cross-database emotion recognition performance. Multiple transfer learning tasks were set up on the three databases mentioned above, and cross-database emotion recognition experiments were conducted. The results demonstrate the effectiveness of the method: with the three databases as target domains, it achieves accuracies of 53.64%, 43.66%, and 35.87%, respectively, improving on the current state-of-the-art methods of the same type by 1.79%, 0.85%, and 1.02%.
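The abstract states which discrepancies are minimized in each sub-structure but not how they are combined; a hedged sketch of the joint objective, assuming a dynamically weighted sum of the global and class-wise (subdomain) discrepancies as in common joint/dynamic adaptation formulations, is:

\[
% assumed form of the joint dynamic objective; lambda, mu, and the per-class weights are illustrative
\mathcal{L} = \mathcal{L}_{\mathrm{cls}}
+ \lambda \sum_{r=1}^{R}\Bigl[\mu\,\widehat{\mathrm{MMD}}^2\bigl(g_r(\mathcal{D}_s),g_r(\mathcal{D}_t)\bigr)
+ (1-\mu)\,\widehat{\mathrm{LMMD}}^2\bigl(g_r(\mathcal{D}_s),g_r(\mathcal{D}_t)\bigr)\Bigr],
\]
\[
\widehat{\mathrm{LMMD}}^2(\mathcal{D}_s,\mathcal{D}_t)
= \frac{1}{C}\sum_{c=1}^{C}\Bigl\| \sum_{i=1}^{n_s} w^{sc}_i\,\phi(f^s_i) - \sum_{j=1}^{n_t} w^{tc}_j\,\phi(f^t_j) \Bigr\|_{\mathcal{H}}^2,
\]

where \(g_1,\dots,g_R\) are the sub-structures mapping the original features into the \(R\) sub-feature spaces, \(C\) is the number of emotion classes, \(w^{sc}_i\) and \(w^{tc}_j\) are per-class sample weights (computed from source labels and target pseudo-labels, respectively), and \(\mu\in[0,1]\) balances global and subdomain alignment dynamically during training. The trade-off parameters and weighting scheme here are assumptions for illustration, not values taken from the thesis.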