
Research On Continuous Domain Adaptation Theory And Methods Under Dynamic Visual Scenarios

Posted on: 2024-07-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S C Niu
Full Text: PDF
GTID: 1528307184480644
Subject: Software engineering
Abstract/Summary:
Deep neural networks have made great progress on various challenging tasks, including image recognition and video analysis. One prerequisite behind this success is the assumption that the environment is static. In real applications, however, the open world changes dynamically. For example, sensor degradation or weather changes may shift the test data distribution online, and the training data distribution may also vary as data grow or tasks change. In these cases, a fixed deep model may suffer severe performance degradation, so model adaptation after deployment is needed. This topic has recently gained increased attention and achieved some progress, but it still suffers from several limitations, including low efficiency, low stability, and catastrophic forgetting on the in-distribution/source domain. This thesis seeks to address these limitations and proposes a series of methods to improve the practical relevance of model adaptation. The main novelty and contributions of this thesis are as follows.

(1) In real-world applications, data often arrive in a growing manner, where the data volume and the number of classes may increase dynamically. This raises a critical learning challenge: as the data volume or the number of classes grows, one must promptly adjust the neural model capacity to maintain promising performance. To address this, this thesis presents a neural architecture adaptation method, named Adaptation eXpert, to efficiently adjust previous architectures for the growing data. Specifically, this thesis introduces an architecture adjuster that generates a suitable architecture for each data snapshot, based on the previous architecture and the extent of difference between the current and previous data distributions. Furthermore, an adaptation condition is proposed to determine whether adjustment is necessary, avoiding unnecessary and time-consuming adjustments. Extensive experiments on two data
growth scenarios (increasing data volume and increasing number of classes) demonstrate the effectiveness and superiority of the proposed method.

(2) Though test-time adaptation has shown great potential for handling test distribution shifts, it is still not efficient enough and may forget previously learned knowledge during adaptation. To address this, this thesis first points out that not all test samples contribute equally to model adaptation; high-entropy samples may produce noisy gradients that disrupt the model. Motivated by this, this thesis proposes an active sample selection criterion to identify reliable and non-redundant samples, on which the model is updated by minimizing an entropy loss for test-time adaptation. Furthermore, to alleviate the forgetting issue, a Fisher regularizer is introduced to constrain important model parameters from drastic changes, where the Fisher importance is estimated from test samples with generated pseudo-labels. Extensive experiments on CIFAR-10-C, ImageNet-C, and ImageNet-R verify the effectiveness of the proposed method.

(3) When test-time adaptation (TTA) methods are deployed in the dynamic wild world, they may fail to improve, or may even harm, model performance. Wild test scenarios include mixed distribution shifts, small batch sizes, and online imbalanced label distribution shifts, all of which are common in practice. This thesis investigates the causes of this instability and finds that the batch normalization layer is a crucial factor hindering TTA stability. Conversely, TTA performs more stably with batch-agnostic normalization layers, i.e., group or layer normalization. However, TTA with group and layer normalization does not always succeed and still suffers many failure cases. By digging into these failure cases, this thesis finds that certain noisy test samples with large gradients may disturb model adaptation and lead to collapsed trivial solutions, i.e., assigning the same class label to all samples. To address this collapse issue, this thesis proposes a
sharpness-aware and reliable entropy minimization method, called SAR, which further stabilizes TTA in two ways: i) removing noisy samples with large gradients from adaptation, and ii) encouraging model weights toward a flat minimum so that the model is robust to the remaining noisy samples. Promising results demonstrate that SAR performs more stably than prior methods and is computationally efficient under the above wild test scenarios.
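The reliable-sample idea shared by contributions (2) and (3) — skipping high-entropy predictions whose gradients tend to be noisy, and adapting only on confident ones — can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names and the threshold value are made up for the example, and a real TTA method would apply this filter before an entropy-minimization gradient step on the model.

```python
import math

def entropy(probs):
    """Shannon entropy of one softmax output (a probability vector)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_reliable(batch_probs, e_max):
    """Keep only low-entropy (reliable) predictions for adaptation.

    batch_probs: list of per-sample softmax outputs.
    e_max: entropy threshold (illustrative); samples above it are
           skipped because their gradients tend to be noisy.
    """
    return [p for p in batch_probs if entropy(p) < e_max]

# A confident prediction (low entropy) vs. a near-uniform one (high entropy).
confident = [0.9, 0.05, 0.05]
uncertain = [0.4, 0.3, 0.3]
selected = select_reliable([confident, uncertain], e_max=0.5)
# Only the confident sample survives the filter and would be used
# for the entropy-minimization update.
```

In the actual methods, the surviving samples drive a gradient update that minimizes their prediction entropy, while the filtered-out samples contribute nothing; SAR additionally steers the update toward flat minima.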
Keywords/Search Tags:Deep Learning, Dynamic and Open Real World, Online Deep Model Adaptation, Test-Time Out-of-Distribution Generalization, Robustness, Neural Architecture Optimization