Research And Application Of Few-Shot Image Classification Based On Feature Enhancement

Posted on: 2024-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Li
Full Text: PDF
GTID: 2568307118479754
Subject: Computer technology
Abstract/Summary:
Traditional machine learning and deep learning methods require large numbers of training samples, but it is often difficult to obtain enough data in practical scenarios. Few-shot learning uses prior knowledge to extend a model quickly to new tasks that contain only a few labeled samples, thereby improving performance and generalization. It therefore has significant theoretical and practical value. Metric-based methods are currently one of the mainstream directions for few-shot classification; such a model typically consists of two parts, an embedding (feature extraction) module and a distance metric module. This thesis proposes improvements that address deficiencies in both modules of current methods. The main contributions are as follows:

(1) To address the limitations of existing embedding modules, this thesis proposes a method based on multi-scale and spatial feature enhancement. It uses a grouping strategy to construct an embedding module named the Pyramidal Group-wise Enhancement Network and combines it with the Feature Map Reconstruction distance. The proposed method enlarges the receptive field of the embedding module, distinguishes the importance of spatial positions, and ultimately enhances the discriminability of the extracted features without significantly increasing the number of parameters, and in some cases even reduces it. Comparative and ablation experiments are conducted on two public datasets, and the experimental results verify the effectiveness of the proposed model.

(2) The group convolution used in the Pyramidal Group-wise Enhancement Network leaves the embedding module without inter-group information flow across feature channels during feature extraction. To address this issue, this thesis further improves the proposed embedding module by introducing a channel attention mechanism and constructing a Split Channel Attention module adapted to multi-scale features. The improved embedding module enhances the discriminability of the extracted features. In addition, in view of the limitations of the metric module in the existing model, this thesis introduces multi-level metrics and proposes a Multi-level Feature Map Reconstruction distance, based on the Feature Map Reconstruction distance, that reuses shallow features to increase the quantity and variety of features. Comparative and ablation experiments are conducted on two public datasets, and the experimental results verify the effectiveness of the proposed model.

(3) This thesis applies the above models to bird image recognition and designs and develops a few-shot bird image recognition application system. Different settings are tested on the application system to verify the applicability of the model and the system.

Through experiments and validation on the application system, this thesis demonstrates that the proposed methods achieve promising performance and can be further applied in practical scenarios.
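The two-part structure of a metric-based few-shot classifier described in the abstract (an embedding module followed by a distance metric module) can be illustrated with a minimal Python sketch. This is a generic illustration under stated assumptions, not the thesis's method: `embed` is a hypothetical placeholder for a learned network such as the Pyramidal Group-wise Enhancement Network, and class-mean prototypes with squared Euclidean distance stand in for the Feature Map Reconstruction distance. The function names, feature vectors, and labels are invented for the example.

```python
from collections import defaultdict


def embed(x):
    # Placeholder embedding module (assumption): identity on a feature vector.
    # A real system would run a learned network here.
    return list(x)


def squared_euclidean(a, b):
    # Stand-in distance metric module (assumption): squared Euclidean distance.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def class_prototypes(support):
    # support: list of (feature_vector, label) pairs for one episode.
    grouped = defaultdict(list)
    for x, label in support:
        grouped[label].append(embed(x))
    # Each prototype is the per-dimension mean of that class's embeddings.
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(query, prototypes):
    # Predict the class whose prototype is closest to the query embedding.
    q = embed(query)
    return min(prototypes, key=lambda label: squared_euclidean(q, prototypes[label]))


# A toy 2-way 2-shot episode with made-up 2-D features and labels.
support = [
    ([0.0, 0.1], "sparrow"),
    ([0.2, 0.0], "sparrow"),
    ([1.0, 0.9], "finch"),
    ([0.8, 1.1], "finch"),
]
prototypes = class_prototypes(support)
prediction = classify([0.9, 1.0], prototypes)
```

In this sketch, changing the embedding or the metric only means replacing `embed` or `squared_euclidean`, which mirrors how the thesis improves the two modules separately.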
Keywords/Search Tags: few-shot learning, multi-scale convolution, attention mechanism, multi-level metrics