
Deep Feature Learning And Disentanglement Of Face Images

Posted on: 2024-04-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y D Li
Full Text: PDF
GTID: 1528307079989029
Subject: Computer Science and Technology
Abstract/Summary:
The face conveys rich personal information, including race, skin color, gender, age, identity, and expression, and is the main channel for identity recognition and emotion expression. It plays an important role in interpersonal communication and human-computer interaction, and has great research value and wide application prospects. Deep feature learning is currently the dominant approach to face feature learning and has made great progress in face recognition, facial expression recognition, facial age estimation, forged face detection, facial attribute editing, and related tasks. In recent years, the ever-increasing demand for solving complex challenges in practical applications has triggered a boom in facial deep feature learning. This dissertation proposes a series of deep feature learning and disentanglement methods for different face-related challenges, described as follows.

In Chapter 3, a cropping- and attention-based approach for masked face recognition is proposed. The global COVID-19 pandemic has made people realize that wearing a mask is one of the most effective ways to protect against viral infection, which poses serious challenges for existing face recognition systems. Masked face recognition faces two main challenges: 1) face detection systems struggle to accurately detect masked face images, and 2) most of the discriminative facial features are severely corrupted. To tackle these difficulties, a new masked face recognition method is proposed that integrates a cropping-based approach with the Convolutional Block Attention Module (CBAM). The optimal cropping is explored for each case, while CBAM is adopted to focus on the regions around the eyes. Comprehensive experiments on several benchmark datasets show that the proposed approach significantly improves masked face recognition performance compared with other state-of-the-art approaches.

In Chapter 4, a multi-modal Transformer for facial expression recognition (FER) in the wild is proposed. FER in the wild is particularly challenging because of unconstrained variations (occlusion, pose, illumination, etc.) and annotation ambiguity caused by the subjectiveness of annotators, ambiguous facial expressions, or low-quality facial images. To address this problem, a novel multifarious supervision-steering Transformer for FER in the wild, referred to as FER-former, is proposed. Specifically, to exploit the complementary merits of the features provided by prevailing CNNs and Transformers, a hybrid stem is designed to cascade the two learning paradigms. A FER-specific Transformer encoder is devised to characterize conventional hard one-hot label-focusing tokens and CLIP-based text-oriented tokens in parallel for final classification. The extracted features are then downsampled to obtain diverse spatial cues, enabling the model to overcome occlusion and pose variations. More importantly, FER-former endows image features with text-space semantic correlations by supervising the similarity between image features and text features. Extensive experiments on popular benchmarks demonstrate the superiority of FER-former over existing state-of-the-art methods.
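As a concrete illustration of the attention component referenced in Chapter 3, the sketch below gives a minimal PyTorch-style CBAM block (channel attention followed by spatial attention). The layer sizes and class names are illustrative assumptions; the chapter's cropping strategy and backbone are not reproduced here.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention: reweight feature channels using pooled descriptors."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling descriptor
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling descriptor
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * w

class SpatialAttention(nn.Module):
    """Spatial attention: emphasize informative locations in the feature map."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel then spatial attention."""
    def __init__(self, channels):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x):
        return self.spatial(self.channel(x))
```

In the masked-face setting, the spatial branch is the part that can learn to emphasize the uncorrupted region around the eyes.

The text-oriented supervision in Chapter 4 can be summarized as pulling each image feature toward the text embedding of its expression label. The sketch below is a minimal version of such a loss under that assumption, taking precomputed per-class text embeddings (e.g., from a CLIP text encoder) as input; the actual FER-former head, tokenization, and prompt design are not detailed in this abstract.

```python
import torch
import torch.nn.functional as F

def text_supervision_loss(image_feats, text_embeds, labels, temperature=0.07):
    """Cross-entropy over cosine similarities between image features and
    per-class text embeddings (one embedding per expression category).

    image_feats: (batch, dim) features from the image branch.
    text_embeds: (num_classes, dim) fixed text embeddings.
    labels:      (batch,) ground-truth expression indices.
    """
    image_feats = F.normalize(image_feats, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)
    logits = image_feats @ text_embeds.t() / temperature  # (batch, num_classes)
    return F.cross_entropy(logits, labels)

# Illustrative usage (the weighting is an assumption, not from the dissertation):
# loss = ce_loss + lambda_text * text_supervision_loss(img_f, txt_e, y)
```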
In Chapter 5, a dual-channel feature disentanglement approach for identity-invariant facial expression recognition is proposed. Facial expression recognition is a challenging task owing to subtle inter-class differences and significant intra-class variations. To address this problem, we propose a novel dual-channel alternation training strategy, in which image pairs with different expressions from the same identity and image pairs with the same expression from different identities are alternately fed into a Siamese network for model training. Unlike previous studies, the features extracted by each branch of the Siamese network are disentangled into three feature subspaces, namely an expression-related subspace, an identity-related subspace, and a shared subspace, to reduce the potential negative effects of expression-related features being contaminated by identity components. To further enhance the ability to pull the same expressions together and push different expressions apart in the feature space, the Hilbert–Schmidt independence criterion (HSIC) is introduced to design an identity-sensitive and expression-sensitive loss function, owing to its excellent ability to measure the statistical dependence between high-dimensional representations. Comprehensive experiments on benchmark datasets demonstrate that the proposed approach produces competitive recognition results compared with state-of-the-art methods.

In summary, this dissertation presents an in-depth study of feature corruption, annotation ambiguity, and feature entanglement in masked face recognition and facial expression recognition. Targeted deep feature learning is achieved through flexible design of network structures and loss functions. Extensive experiments show that the proposed methods are effective at improving deep feature learning in a variety of application scenarios.
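For reference, the Hilbert–Schmidt independence criterion used in Chapter 5 has a standard (biased) empirical estimator, sketched below with Gaussian kernels. How this estimator is assembled into the identity-sensitive and expression-sensitive loss is specific to the dissertation and not reproduced here; the median-heuristic bandwidth is an assumption.

```python
import torch

def gaussian_kernel(x, sigma=None):
    """Pairwise Gaussian (RBF) kernel matrix for a batch of feature vectors."""
    dist = torch.cdist(x, x).pow(2)                    # squared Euclidean distances
    if sigma is None:                                  # median heuristic for bandwidth
        sigma = dist.detach().median().sqrt().clamp(min=1e-6)
    return torch.exp(-dist / (2 * sigma ** 2))

def hsic(x, y, sigma=None):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2 with centering matrix
    H = I - (1/n) * ones. Larger values indicate stronger dependence between
    the two feature sets."""
    n = x.size(0)
    k = gaussian_kernel(x, sigma)
    l = gaussian_kernel(y, sigma)
    h = torch.eye(n, device=x.device) - torch.ones(n, n, device=x.device) / n
    return torch.trace(k @ h @ l @ h) / (n - 1) ** 2

# A disentanglement loss could, for example, penalize
# hsic(expression_feats, identity_feats) to push the two subspaces toward
# independence; this particular usage is illustrative.
```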
Keywords/Search Tags:deep learning, masked face recognition, facial expression recognition, attention mechanism, feature disentanglement, multi-modal learning