
Research On Image Understanding Method Under Limited Target Scene Labeled Data

Posted on: 2023-12-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J J Li
GTID: 1528306929992649
Subject: Control Science and Engineering
Abstract/Summary:
In recent years, deep learning (DL), the most representative machine learning technology, has made great progress and now dominates the development of image understanding tasks. However, its success relies heavily on large-scale labeled data from the target scene. Because of constraints on data collection or annotation cost, often only a small amount of labeled target-scene data can be obtained. A model trained on these few labeled samples alone overfits and performs poorly. Studying image understanding under limited target scene labeled data is therefore important for expanding the application scenarios of image understanding.

The most widely adopted approach to this problem is to use an annotated source domain dataset to improve the model's generalization, so that the model can achieve higher performance in target scenes with only a few labeled samples. Since the source domain data and the target domain data are collected from different scenarios, the two domains differ. These differences fall into two main categories: first, the categories of the source domain data and the target domain data are inconsistent; second, the distributions of the source domain data and the target domain data are inconsistent. Research on these two settings has formed two directions: few-shot learning and domain adaptation.

Although current methods have made some progress on these two problems, the following challenges remain:

1) In few-shot image classification, the features extracted by current methods respond only to some discriminative regions, so the feature representation is insufficient. In addition, current methods generally use a fixed feature extractor, which struggles to handle different classification tasks.
2) In domain adaptive image classification, current methods mainly focus on cross-domain feature alignment and ignore the consistency between features and class weights.
3) In scenarios such as autonomous driving, it is usually necessary to perform domain adaptive image classification at the pixel level, that is, domain adaptive image semantic segmentation. Since a game engine can control the content it generates, source domain data produced this way usually shares the same category space as the target domain data, which makes the task a domain adaptive image semantic segmentation problem. Current methods for this problem rarely address the classification accuracy of target domain boundary pixels and instead rely on multi-stage distillation to improve model accuracy. Distillation, however, makes the training process extremely complicated.

This dissertation proposes specific solutions to these challenges. Its main contributions can be summarized in the following three points:

· For the few-shot image classification problem, this dissertation proposes an intact feature response and task feature learning method. First, it designs a cross-set erasing inpainting module that removes discriminative regions of the target object and then constrains the original image and the erased image to be consistent, thereby forcing the feature extractor to focus on a more intact target object. Second, it designs a task-specific feature modulation module that enables the model to adaptively select appropriate features for each task, improving the adaptability of the features to different tasks. The proposed method obtains considerable gains on multiple baseline few-shot classification models, which supports its effectiveness.

· For the domain adaptive image classification problem, this dissertation proposes probabilistic contrastive learning, a simple yet efficient loss function. It only requires replacing the features in traditional contrastive learning with probabilities and removing the L2 normalization, which effectively narrows the distance between the features and the class weights. Experiments on multiple domain adaptation tasks show that this simple loss significantly improves the performance of baseline models, demonstrating the effectiveness of probabilistic contrastive learning.

· For domain adaptive semantic segmentation, this dissertation proposes a target domain information mining method. Unlike previous methods that use distillation, it establishes an efficient domain adaptive semantic segmentation method that does not rely on distillation, by fully mining the information in the target domain data. First, it designs a high-confidence target domain pixel cross-domain mixing module to mine boundary information in the target domain. The module constructs correctly labeled target domain boundary pixels by copying high-confidence pixels from target domain images into source domain images, thereby improving the model's ability to discriminate target domain boundary pixels. Second, it designs a multi-level contrastive learning module to mine discriminative features in the target domain at both the pixel level and the prototype level. On two standard domain adaptive semantic segmentation benchmarks, the proposed method not only achieves state-of-the-art performance but also greatly simplifies the training process.

In summary, this dissertation analyzes and studies the image understanding problem under limited target scene labeled data. From the perspectives of the intactness and adaptability of features, the consistency between features and classifier weights, and target domain information mining, it proposes several methods to reduce the model's dependence on large-scale labeled data in the target scene. The experimental results show that the proposed methods achieve good results in multiple data-limited scenarios, laying a technical foundation for the wide application of image understanding technology.
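
The abstract describes probabilistic contrastive learning only in words: use class probabilities in place of features in a contrastive loss, and drop the L2 normalization. The following is a minimal sketch of that idea, not the dissertation's implementation. The InfoNCE-style form of the loss, the use of two augmented views per sample, the temperature value, and all function names are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn one row of classifier logits into class probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def probabilistic_contrastive_loss(logits_a, logits_b, temperature=0.1):
    """InfoNCE-style loss computed on probabilities (assumed form).

    logits_a[i] and logits_b[i] are classifier outputs for two views of
    sample i (hypothetical setup). Similarity is the raw dot product of
    the probability vectors, with no L2 normalization applied.
    """
    probs_a = [softmax(row) for row in logits_a]
    probs_b = [softmax(row) for row in logits_b]
    n = len(probs_a)
    total = 0.0
    for i in range(n):
        # Similarity of view A of sample i to view B of every sample.
        sims = [dot(probs_a[i], probs_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over positives and negatives.
        m = max(sims)
        log_denominator = m + math.log(sum(math.exp(s - m) for s in sims))
        # The positive pair is (i, i); the other pairs act as negatives.
        total += log_denominator - sims[i]
    return total / n


# Two samples, three classes. Uninformative predictions vs. confident ones.
uniform = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
confident = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]
loss_uniform = probabilistic_contrastive_loss(uniform, uniform)
loss_confident = probabilistic_contrastive_loss(confident, confident)
```

One way to read the role of the missing normalization: the dot product of two probability vectors reaches its maximum of 1 only when both are the same one-hot vector, so pulling positive pairs together also pushes predictions toward one-hot outputs. In this sketch, the uniform predictions give a loss of log 2, while the confident, agreeing predictions give a loss close to zero.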
Keywords/Search Tags: Image Understanding, Few-shot Classification, Domain Adaptive Classification, Domain Adaptive Semantic Segmentation