Font Size: a A A

Research On Key Issues Of Domain Adaptation And Its Application In Skin Disease Diagnosis

Posted on:2023-03-19Degree:DoctorType:Dissertation
Country:ChinaCandidate:T XiaoFull Text:PDF
GTID:1524306839979959Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Traditional machine learning assumes that the training data and test data are independent and identically distributed(IID),and requires a large number of labeled data to train the ideal model.In practice,due to the variation of environment or the limitation of data capture conditions,the collected data may not be independent and identically distributed,which leads to a significant decline in the generalization ability of the model.The frequently used solution is to re-collect and re-label the data and re-build the model.In some scenarios,it is very expensive,if not even impossible,to collect and label data.Domain Adaptation(DA)is one of the proposed techniques to solve this data disparity problem when there are differences in the distribution of training data and test data.Domain adaptation is a sub-field of transfer learning.It has a wide range of applications in computer vision,natural language processing,bioinformatics,and others.The main idea of domain adaptation is to apply the knowledge or patterns learned in the source domain(s)to a different but related target domain.on the one hand,it reuses the existing knowledge and turns the process of traditional machine learning which learns from scratch into a cumulative learning process;On the other hand,it relaxes the IID data assumption of traditional machine learning,the data involved in learning can obey different distributions,which can effectively solve the problem of cross-domain learning.Although there is a relatively rich literature on domain adaptation,we still face several key unsolved issues in domain adaptation such as insufficient feature fitting,insufficient cross-domain generalization ability,insufficient inter-domain distribution adaptation,and negative transfer in a multi-source domain.This thesis proposes solutions to the key issues and the problems of applying domain adaptation in dermatology diagnosis.The proposed solutions are applied and tested in skin disease diagnosis.More specifically,the contributions of the thesis are the following:(1)To address the issue of insufficient feature fitting in traditional domain adaptation,we propose a discriminant subspace learning framework of structure preservation and distribution alignment(SPDA).SPDA statistically aligns the cross-domain margin distribution and conditional distribution and geometrically preserves the global and local structure of the data.An epsilon-dragging technique is proposed to increase the discrimination of source domain data,to reduce the generalization error of domain adaptation from the perspective of domain adaptation theory.SPDA solves the problem of under-fitting caused by only using the geometric structure or statistical attributes of data.Extensive experimental results on five public data sets show that SPDA can alleviate the problem of insufficient feature fitting in traditional domain adaptation methods,lead the current best domain adaptation methods in average recognition accuracy,and even achieve better recognition accuracy on deep features than some of deep domain adaptation methods.(2)To address the issue of insufficient distribution adaptation and insufficient crossdomain generalization ability when the distribution difference between domains is large,we propose the following two methods: The first method is an importance-weighted conditional adversarial(IWCA)domain adaptation network.An importance criterion based on domain similarity and prediction certainty is proposed to assign weights to different samples,which can reduce the harmful effects of difficult-to-transfer samples when reducing their cross-domain class conditional distribution differences.Furthermore,a sample selection criterion derived from the perspective of transfer cross-validation is used to progressively select appropriate pseudo-labeled target samples to fine-tune the target model,which greatly improves the discrimination of the model.The experimental results on the object recognition tasks show that IWCA can achieve the best average classification accuracy.The experimental results of skin disease classification show that when there are large differences between domains,especially when there are noise data,IWCA can achieve 4% to 12 % accuracy improvement compared with other domain adaptation methods.The second method is a batch nuclear-norm-based domain adaptation method to enhance feature discriminability and transferability simultaneously in adversarial domain adaptation(MRE).Firstly,we analyze the reasons that the transferability is enhanced at the expense of degraded discriminability in adversarial domain adaptation.Secondly,we study how to improve feature discriminability.The data of the same category in the label space presents a low-rank structure,and the data of different categories present a high-rank structure.The minimum nuclear norm is used to replace the low-rank constraint,and the maximum nuclear norm is used to replace the high-rank constraint,to enhance the discrimination of features.At the same time,align the distribution of data in feature space and label space to enhance the transferability of features.Validation experiments on skin disease classification show that this method can significantly improve the performance of skin disease classification when there are large differences between domains.IWCA and MRE performed almost equally in different domain adaptation scenarios in the classification of skin diseases.(3)To address the issue of negative transfer in the multi-source domains,we propose a multi-source domain adaptation method based on learning domain-specific representation and domain relationships,LDRDR.The research fully excavates the transferable feature representation from multiple source domains and applies it to the learning of target domains.Firstly,we study the feature extraction network.After the common feature extractor,a domain-specific feature extractor module is added for each source domain target domain pair to fully learn the domain common knowledge and domain-specific knowledge of each source domain.Secondly,we study how to measure the correlation between each source domain and the target domain,and integrate each task discriminator according to the correlation so that the target domain can make the best use of the knowledge of each source domain.Extensive experiments on four object classification datasets show that LDRDR can alleviate the negative transfer problem.The experimental results of skin disease classification across multiple sources show that LDRDR can achieve the best domain adaptation performance.(4)For different domain adaptation scenarios in cross-domain skin disease diagnosis,the three deep domain adaptation methods proposed in this paper have achieved the best domain adaptation performance,and their performance is significantly better than other domain adaptation methods.Compared with the method without domain adaptation,the average performance of these proposed domain adaptation methods are improved by about5%-12%.
Keywords/Search Tags:Transfer Learning, Domain Adaptation, Deep Domain Adaptation, Distribution Matching, Multi-source Domain Adaptation, Image Classification, Skin Disease Classification
PDF Full Text Request
Related items