Research On Unsupervised Domain Adaptation Method Based On Consistency Between Samples

Posted on:2024-08-16

Degree:Master

Type:Thesis

Country:China

Candidate:Y Cao

Full Text:PDF

GTID:2568306941492994

Subject:Software engineering

Abstract/Summary:

In recent years, continuous advances in big data, cloud computing, and computer hardware have greatly improved the performance of deep learning methods. With the help of massive data, deep learning has achieved satisfactory results in areas such as computer vision and natural language processing. To ensure that a classifier learned from training data is reliable and accurate, deep learning usually relies on two basic assumptions: (1) the training samples (source domain) and the test samples (target domain) are independent and identically distributed; (2) a large number of labeled samples is available for training. In real-world scenarios, however, a dataset with a large number of labels is rarely available. Due to external factors, the source and target domains often have different feature distributions; for example, the target domain may contain data collected from different angles or sensors. Therefore, to avoid expensive manual labeling and to take advantage of related labeled datasets, researchers have carried out a range of work on domain adaptation.

Most current domain adaptation algorithms learn domain-invariant features through distance measures or adversarial methods, but these methods often ignore the classification information hidden in the target-domain data. In response, this thesis studies the more difficult unsupervised domain adaptation scenario, in which the target domain has no annotations, and proposes an unsupervised domain adaptation method based on consistency between samples. A classifier for the unsupervised domain adaptation task is trained by computing a consistency loss on target-domain samples, so that each original target-domain sample and its corresponding perturbed sample receive consistent final predictions, which helps the model adapt better to classification tasks in the target domain. Specifically, building on the classical MK-MMD metric, the algorithm takes the structural information of the target domain into account and is trained by adding pseudo-labels and designing a suitable consistency loss function. With these improvements, the proposed model can make full use of the classification information in the target domain while aligning the feature distributions of the source and target domains, thereby improving the domain adaptation effect. The details of the study are as follows:

(1) To address the lack of labeled target-domain samples in unsupervised domain adaptation, this thesis proposes a method that assigns pseudo-labels to unlabeled samples according to their confidence level. A suitable threshold is set first, and pseudo-labels are assigned to the target-domain samples with higher confidence. The pseudo-label information is then updated continuously through iterative training until the model converges to the optimal solution. By making reasonable use of the pseudo-label information in the target domain, the latent features of the target domain can be explored effectively, improving the accuracy of the classifier on target-domain samples.

(2) Existing unsupervised domain adaptation methods cannot effectively learn classifiers with robust decision boundaries for image classification in the target domain. To address this issue, this thesis builds on the pseudo-labeling idea and applies a random augmentation method to generate high-quality unlabeled augmented samples in the target domain. By analyzing the connection between the original and augmented samples, the thesis constructs a suitable consistency loss so that samples of the same category lie as close as possible in the latent subspace, thereby enhancing the discriminative ability of the learned representations. The cross-entropy loss on the source domain, the domain confusion loss, and the consistency loss on the target domain are combined into the overall training loss to improve the generalization performance of the model.

This thesis conducts comparative experiments on five public unsupervised domain adaptation datasets in two groups: (SVHN, MNIST, USPS) and (Office-Home, ImageCLEF-DA). According to the results, the proposed algorithm achieves good classification performance on all of these datasets, demonstrating its effectiveness in unsupervised domain adaptation tasks.
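
The abstract does not give the exact form of the losses, so the following is only a minimal sketch of how confidence-thresholded pseudo-labels and a consistency term between original and augmented target samples might be computed. Everything named here is an assumption for illustration: the threshold value 0.95, the weights `lam` and `mu`, the function names, and the use of a masked cross-entropy (in the style of FixMatch) as a stand-in for the thesis's consistency loss. The domain confusion term (MK-MMD in the thesis) is passed in as a precomputed number and is not implemented.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(probs, threshold=0.95):
    """Return the most likely class if its probability reaches the
    (assumed) confidence threshold, otherwise None."""
    conf = max(probs)
    return probs.index(conf) if conf >= threshold else None


def consistency_loss(probs_orig, probs_aug, threshold=0.95):
    """Illustrative consistency term: cross-entropy between the
    pseudo-label of each original target sample and the prediction on
    its augmented copy. Low-confidence samples contribute zero."""
    if not probs_orig:
        return 0.0
    total = 0.0
    for p, q in zip(probs_orig, probs_aug):
        label = pseudo_label(p, threshold)
        if label is not None:
            total += -math.log(max(q[label], 1e-12))
    return total / len(probs_orig)


def overall_loss(source_ce, domain_loss, consistency, lam=1.0, mu=1.0):
    """Weighted sum of the three terms named in the abstract.
    The weights lam and mu are hypothetical hyperparameters."""
    return source_ce + lam * domain_loss + mu * consistency
```

In an iterative training loop, `pseudo_label` would be re-evaluated on the current model's predictions at each step, which corresponds to the continuous pseudo-label updating described in point (1).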

Keywords/Search Tags:

Unsupervised domain adaptation, Pseudo-label, Consistency training, Distance metric, Data augmentation

Related items

1	Multi-Stage Noise-Resistant Unsupervised Domain Adaptation Method For Causality
2	Research On Domain Adaptation Algorithm Based On Dynamic Inter-Class Distance And Pseudo-Labeled Clustering
3	Research On Unsupervised Domain Adaptation In Deep Learning
4	Research On Person Re-identification Technology Based On Data Augmentation And Domain Adaptation
5	Research On Unsupervised Domain Adaptation Image Classification Method Based On Adversarial Learning
6	Research On Cross-domain Person Re-identification Based On Pseudo-label Optimization
7	Research On Cross Domain Person Re-identification Based On Unsupervised Domain Adaptation
8	Research And Design An Unsupervised Person Re-identification Retrieval System
9	Unsupervised Domain Adaptation Research Based On Domain Relation Utilization
10	Label-free Data Poisoning Attack Against Deep Unsupervised Domain Adaptation