Few-shot learning is a machine learning technique designed to train models for tasks such as classification or regression from very small amounts of training data. Traditional deep-learning models usually require large amounts of training data and often perform poorly when applied directly to few-shot problems. Meta-learning refers to learning how to learn: its goal is to enable machines to adapt quickly to new tasks and environments and to show strong learning ability when facing unknown situations, so it is widely used to tackle few-shot problems. From the perspective of meta-learning, this thesis studies how to use the meta-learning framework to achieve fast learning and generalization on few-shot tasks. The research content of this thesis is as follows:

(1) Model-agnostic meta-learning (MAML) is studied, and its effectiveness is explained from the perspective of Bayesian theory: the meta-trained model acts as a prior for each individual task, while the model adapted to a single task is a maximum a posteriori estimate under that prior. Accordingly, the inner-loop update rule is modified so that only the parameters of the last linear layer of the model are updated, and the experimental results show that model performance is not affected by this modification. On this basis, to make the inner-loop updates more effective and better suited to the current task, two adaptive strategies for the inner-loop hyperparameters are proposed: hyperparameter adaptation based on gradient descent (HAGD) and hyperparameter adaptation based on network generation (HANG). HAGD updates the hyperparameters by differentiating the loss function with respect to them, while HANG treats the hyperparameters as a function of the model parameters and the loss and generates task-specific hyperparameters through a fully connected network; illustrative sketches of both strategies are given below. Experiments show that both adaptive strategies improve few-shot classification accuracy to varying degrees, with HANG giving the larger improvement.

(2) To apply the model to new tasks quickly, task-specific initialization strategies are constructed. For the inner loop of MAML, the maximum parameter value of each layer after forward propagation of the model and the mean gradient of the loss function with respect to each layer's parameters are computed and concatenated into a single vector that serves as the information representation of the task. A fully connected network built on this representation generates a task-specific attention mapping over the initial parameters, which then participates in the training of the model. To improve generalization, an entropy regularization term is added to the cross-entropy loss function to reduce the overlap between classes, extract distinct feature information for each class, and avoid overfitting; sketches of the task-specific initialization and the regularized loss are also given below. Experiments show that task-specific initialization and entropy regularization significantly improve meta-learning training and yield higher accuracy and better generalization in few-shot classification scenarios.
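
The modified MAML inner loop and the HAGD strategy from (1) can be summarized with a minimal PyTorch sketch, assuming a small fully connected backbone: only the last linear layer is adapted in the inner loop, and the inner-loop learning rate `inner_lr` is treated as a trainable meta-parameter, so it is updated by differentiating the loss with respect to it. All class and variable names (`FewShotNet`, `adapt_and_evaluate`, `meta_step`) and the network sizes are illustrative assumptions, not the thesis implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FewShotNet(nn.Module):
    def __init__(self, in_dim=784, hid_dim=64, n_way=5):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
        self.head = nn.Linear(hid_dim, n_way)       # only this layer is adapted in the inner loop

    def forward(self, x, head_weight=None, head_bias=None):
        h = self.body(x)
        if head_weight is None:                     # use the shared (meta-learned) head
            return self.head(h)
        return F.linear(h, head_weight, head_bias)  # use the task-adapted head

model = FewShotNet()
inner_lr = nn.Parameter(torch.tensor(0.4))          # HAGD: the inner step size is itself trainable
meta_opt = torch.optim.Adam(list(model.parameters()) + [inner_lr], lr=1e-3)

def adapt_and_evaluate(support_x, support_y, query_x, query_y, steps=1):
    """One episode: adapt only the head on the support set, evaluate on the query set."""
    w, b = model.head.weight, model.head.bias
    for _ in range(steps):
        loss = F.cross_entropy(model(support_x, w, b), support_y)
        # create_graph=True keeps the graph so meta-gradients reach both the
        # initial parameters and inner_lr (this realizes the HAGD update).
        gw, gb = torch.autograd.grad(loss, (w, b), create_graph=True)
        w, b = w - inner_lr * gw, b - inner_lr * gb
    return F.cross_entropy(model(query_x, w, b), query_y)

def meta_step(task_batch):
    """One outer-loop update over a batch of tasks."""
    meta_opt.zero_grad()
    meta_loss = torch.stack([adapt_and_evaluate(*task) for task in task_batch]).mean()
    meta_loss.backward()                            # gradients also flow into inner_lr
    meta_opt.step()
    return meta_loss.item()

# Hypothetical 5-way, 5-shot episode with random data.
task = (torch.randn(25, 784), torch.randint(0, 5, (25,)),
        torch.randn(25, 784), torch.randint(0, 5, (25,)))
print(meta_step([task]))
```

Using `create_graph=True` keeps the second-order terms so that the meta-gradient reaches both the shared initialization and the inner-loop learning rate; a first-order variant would detach the inner gradients instead.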
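
The HANG strategy can be sketched under the assumption that the generated hyperparameter is a single inner-loop learning rate: a small fully connected network reads the current support loss together with simple statistics of the head parameters and outputs a positive, task-specific step size. The choice of input features, the network size, and the name `LRGenerator` are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LRGenerator(nn.Module):
    """Fully connected network that outputs a task-specific inner-loop learning rate."""
    def __init__(self, n_features=4, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Softplus(),    # keep the generated step size positive
        )

    def forward(self, loss_value, weight, bias):
        # Input features: the scalar support loss plus simple parameter statistics
        # (detached here, so the generator is trained only through the generated step size).
        feats = torch.stack([loss_value.detach(),
                             weight.detach().mean(),
                             weight.detach().std(),
                             bias.detach().mean()])
        return self.net(feats).squeeze()

lr_gen = LRGenerator()

def inner_update(loss, weight, bias):
    """One HANG-style inner step: the step size is generated for the current task."""
    step = lr_gen(loss, weight, bias)
    gw, gb = torch.autograd.grad(loss, (weight, bias), create_graph=True)
    return weight - step * gw, bias - step * gb

# Hypothetical usage with a standalone 5-way linear head.
head = nn.Linear(64, 5)
x, y = torch.randn(25, 64), torch.randint(0, 5, (25,))
support_loss = F.cross_entropy(head(x), y)
new_w, new_b = inner_update(support_loss, head.weight, head.bias)
```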
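
The task-specific initialization from (2) can be sketched as follows, assuming that the per-layer statistics are the maximum value of each parameter tensor and the mean gradient of the support loss for each parameter tensor, and that the generated attention acts as a per-layer scaling of the initial parameters. Names such as `task_representation` and `InitAttention` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def task_representation(model, support_x, support_y):
    """Build the task vector from per-layer parameter maxima and gradient means."""
    loss = F.cross_entropy(model(support_x), support_y)
    params = list(model.parameters())
    grads = torch.autograd.grad(loss, params)
    stats = []
    for p, g in zip(params, grads):
        stats.append(p.detach().max())   # largest parameter value in the layer
        stats.append(g.detach().mean())  # mean gradient for the layer
    return torch.stack(stats)            # shape: (2 * number of parameter tensors,)

class InitAttention(nn.Module):
    """Fully connected network mapping the task vector to one attention weight per layer."""
    def __init__(self, n_param_tensors, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_param_tensors, hidden), nn.ReLU(),
            nn.Linear(hidden, n_param_tensors), nn.Sigmoid(),
        )

    def forward(self, task_vec):
        return self.net(task_vec)

def task_specific_init(model, attention):
    """Rescale each parameter tensor by its task-specific attention weight."""
    return [a * p for a, p in zip(attention, model.parameters())]

# Hypothetical usage: a small model with 4 parameter tensors (2 weights, 2 biases).
model = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 5))
attn_net = InitAttention(n_param_tensors=4)
support_x, support_y = torch.randn(25, 784), torch.randint(0, 5, (25,))
tvec = task_representation(model, support_x, support_y)
init_params = task_specific_init(model, attn_net(tvec))  # fed to the inner loop as initial parameters
```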
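
Finally, a sketch of the entropy-regularized objective from (2), under the assumption that the added term penalizes the entropy of the predicted class distribution so that predictions sharpen and overlap between classes is reduced; the regularization weight and the exact form of the term are assumptions, not the thesis formulation.

```python
import torch
import torch.nn.functional as F

def entropy_regularized_ce(logits, targets, reg_weight=0.1):
    """Cross-entropy plus an entropy penalty on the predicted class distribution."""
    ce = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=-1).mean()
    return ce + reg_weight * entropy     # lower entropy -> sharper, less overlapping classes

# Example call with hypothetical 5-way logits.
logits = torch.randn(25, 5, requires_grad=True)
targets = torch.randint(0, 5, (25,))
loss = entropy_regularized_ce(logits, targets)
loss.backward()
```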