Research On Embedded Zero-Shot Learning Methods Based On Swin Transformer

Posted on:2024-03-05

Degree:Master

Type:Thesis

Country:China

Candidate:J Q Gao

GTID:2568307115964069

Subject:Computer Science and Technology

Abstract/Summary:

Traditional image classification usually requires large amounts of labeled data to train models, but in practice data collection and labeling are often difficult. Zero-shot learning, which recognizes objects from classes that have no training samples, has therefore become an active research topic. It uses class-level semantic information to establish connections between seen and unseen classes and thereby recognize the unseen ones. Most existing zero-shot learning algorithms extract features with deep convolutional networks pre-trained on ImageNet, which ignores the distribution mismatch between ImageNet and the zero-shot learning benchmark datasets. To address this problem, this thesis uses Swin Transformer as a new backbone network, applying it to the zero-shot learning field for the first time. The backbone takes the original images as input and uses a self-attention mechanism to obtain visual features guided by semantic information. On this basis, two embedded zero-shot learning algorithms are proposed. The main research work is as follows:

(1) An embedded zero-shot learning algorithm based on multi-label semantic guidance is proposed. While constructing the embedding space of visual features and semantic information on the seen classes, the algorithm also computes the similarity between the semantic representations of seen and unseen classes. This guides the model to consider the unseen classes that are semantically similar to the current seen class, and transfers the similarity structure of the semantic space to the embedding space where classification is finally performed. Doing so alleviates the domain shift problem and yields more accurate classification.

(2) An embedded zero-shot learning algorithm based on multi-scale feature fusion is proposed. Rich attribute features are extracted from the images using the hierarchical structure of Swin Transformer and are then aligned with attribute prototypes to optimize the whole network. As a result, the global features contain more detail for distinguishing fine-grained semantic attributes, which alleviates the limited ability of earlier methods based on deep image features to represent detail.

Both algorithms are validated experimentally on the bird dataset (CUB), the scene dataset (SUN), and the animal dataset (AWA2), and the results show that both achieve good zero-shot classification results.
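The general idea behind embedding-based zero-shot classification and semantic guidance can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the projection, the loss functions, or how the similarities are used in training, so the function names, the linear projection W, the temperature value, and the toy attribute vectors below are all assumptions made for illustration. In the thesis the visual features would come from a Swin Transformer backbone; here they are hand-written vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def project(feature, W):
    """Map a visual feature into attribute space with a linear map W
    (hypothetical; W is a list of rows, one row per feature dimension)."""
    return [sum(f * W[i][j] for i, f in enumerate(feature))
            for j in range(len(W[0]))]


def predict_unseen(feature, W, unseen_attrs):
    """Return the index of the unseen class whose attribute vector is
    most similar to the projected visual feature."""
    embedded = project(feature, W)
    scores = [cosine(embedded, attrs) for attrs in unseen_attrs]
    return max(range(len(scores)), key=scores.__getitem__)


def semantic_soft_targets(seen_attr, unseen_attrs, temperature=0.1):
    """Softmax over attribute-space similarities between one seen class
    and every unseen class (an assumed form of semantic guidance)."""
    sims = [cosine(seen_attr, attrs) / temperature for attrs in unseen_attrs]
    peak = max(sims)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data: three attributes, two unseen classes, two seen classes.
unseen_attrs = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
seen_attrs = [[1.0, 0.0, 0.8], [0.1, 1.0, 0.0]]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity projection
features = [[0.9, 0.1, 0.9], [0.0, 0.8, 0.1]]

predictions = [predict_unseen(f, W, unseen_attrs) for f in features]
targets = [semantic_soft_targets(a, unseen_attrs) for a in seen_attrs]
```

With these toy values, each feature is assigned to the unseen class whose attributes it most resembles, and each seen class receives a probability distribution over the unseen classes that could serve as a soft training signal.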

Keywords/Search Tags:

Zero-Shot Learning, Swin Transformer, Image Classification, Feature Fusion

Related items

1	Few-shot Learning Algorithm Research Based On Multi-scale Feature Measurement Fusion
2	Research On Few-shot Image Classification Algorithm Based On Deep Discriminative Feature Learning
3	Research On Few-shot Image Classification Algorithm Based On Deep Learning
4	Research On Few-shot Learning For Image Classification Based On Data Augmentation
5	Research On Zero-Shot Image Classification Based On Deep Feature Representation
6	Class Knowledge To Semantic Fusion Based Zero-Shot Learning For Image Classification
7	Research On Few-Shot Classification Algorithm Based On Enhance Feature Representation
8	Research On Infrared And Visible Image Fusion Method Based On Depth Convolution Neural Network
9	Research On Self-Supervised Few-Shot Image Classification With Uncertainty Information
10	Research On Low Light Image Enhancement Algorithm Based On Deep Feature Fusion