Deep learning is widely used in image classification, but it requires training on large-scale labeled datasets to achieve good performance. However, in many real-world tasks, such as long-tail data recognition, text classification, and the cold start of recommendation systems, data and annotations are difficult to obtain; the model then overfits the small amount of training data, and its performance drops significantly. It is therefore of great research significance to alleviate this overfitting when training samples and their annotations are insufficient and to obtain performance comparable to training on large-scale data, i.e., the few-shot learning problem.

Existing few-shot methods first pre-train on a large-scale labeled base-class dataset to obtain strong feature extraction capabilities, then use transfer algorithms to adapt the pre-trained model to the new-class distribution, and finally use a classifier to classify new-class samples. Among them, episodic-paradigm-based few-shot methods split the dataset into episodes and use a support-set encoder to extract the category information of the support-set samples, which is then used to transform the target-set sample features, thus achieving fast transfer within the model. This is an important research direction in few-shot learning. However, episodic-paradigm-based few-shot algorithms still face the following problems. In the pre-training stage, the pre-trained model overfits the base-class support-set samples, so the target-set features it extracts are not sufficiently robust. In the transfer stage, the pre-trained model overfits the base-class data distribution, so it cannot be transferred accurately to the new-class distribution. In the classification stage, existing classifiers do not consider the non-linear features that arise because data distributions differ across episodes, and thus cannot classify new-class samples accurately.

To address these problems, this paper conducts research from three aspects: enhancing the robustness of the features extracted by the pre-trained model, transferring the pre-trained model accurately, and classifying non-linear features. The aim is to improve the classification performance of episodic-paradigm-based few-shot algorithms by addressing overfitting to the small number of support-set samples and to the pre-training data. The main contributions of this paper are as follows:

1. In the pre-training stage, this paper investigates and verifies that the episodic-paradigm-based few-shot pre-trained model, and in particular its support-set encoder, overfits the small number of support-set samples. We design a dynamic filter, the Canonical Mean Filter (CMF), to approximately optimize the marginal likelihood of the model over all episodes. This effectively mitigates overfitting to the support-set samples, enhances the generalization ability of the pre-trained model, and provides robust features for the subsequent transfer and classification tasks.

2. In the transfer stage, this paper addresses the overfitting of the pre-trained model to the pre-training data. We investigate and demonstrate why fine-tuning degrades the performance of few-shot methods: feature distortion and biased estimation of the transfer parameters. To address these issues, we propose Linear-Probing-Fine-Tuning with Firth Bias (LP-FT-FB), which effectively mitigates feature distortion and biased estimation during fine-tuning. The algorithm transfers the pre-trained model to the new-class distribution quickly and accurately, thus improving the performance of few-shot methods.

3. In the classification stage, this paper addresses the problem that existing few-shot classifiers do not account for the differing data distributions across episodes. We propose an episodic-wise covariance-based classifier, which maps the features extracted by the model into the subspace of each episode and computes the similarity between them. This enhances the classifier's ability to handle non-linear feature patterns, accelerates the convergence of the algorithm, and improves classification accuracy.

Finally, it is further shown that the methods designed for the three stages can be combined to improve the classification performance of episodic-paradigm-based few-shot methods.
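To make the third idea concrete, the following is a minimal illustrative sketch, not the thesis's actual classifier: one simple way to use per-episode covariance is to estimate a covariance matrix from the episode's support set and classify query samples by Mahalanobis distance to the class prototypes in that whitened space. The function name, the shrinkage parameter `eps`, and all data are hypothetical.

```python
import numpy as np

def episodic_covariance_classify(support, support_labels, query, eps=1e-3):
    """Illustrative episode-wise covariance classifier (simplified sketch).

    support:        (S, D) support-set features for one episode
    support_labels: (S,)   integer class labels
    query:          (Q, D) query (target-set) features
    eps:            shrinkage toward the identity for numerical stability
    """
    classes = np.unique(support_labels)
    # Covariance is estimated per episode, so the metric adapts to the
    # feature distribution of this episode rather than a global one.
    cov = np.cov(support, rowvar=False) + eps * np.eye(support.shape[1])
    inv_cov = np.linalg.inv(cov)
    # Class prototypes: mean support feature per class.
    prototypes = np.stack(
        [support[support_labels == c].mean(axis=0) for c in classes]
    )
    # Mahalanobis distance from every query to every prototype:
    # d(q, p) = (q - p)^T  cov^{-1}  (q - p)
    diff = query[:, None, :] - prototypes[None, :, :]          # (Q, C, D)
    dists = np.einsum('qcd,de,qce->qc', diff, inv_cov, diff)   # (Q, C)
    return classes[np.argmin(dists, axis=1)]
```

The design choice this illustrates is that the similarity metric itself is episode-dependent: two episodes with different feature scales or correlations induce different distance functions, which is one way a classifier can account for distribution shift across episodes.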