
Research And Implementation Of Fine-Grained Few-Shot Visual Classification Algorithms

Posted on: 2024-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: H B Li
Full Text: PDF
GTID: 2568307079959409
Subject: Computer Science and Technology
Abstract/Summary:
Fine-grained Visual Classification (FGVC) is an important research direction in computer vision with a wide range of applications, and it has attracted many researchers. Its recognition targets are sub-categories of the same meta-category, so the granularity is finer than in general classification. Low inter-class variation and high intra-class variation make FGVC tasks extremely challenging. To cope with this, current deep models require a large number of labeled samples for training. However, annotating fine-grained images is expensive, and collected fine-grained images tend to follow long-tailed distributions, which greatly limits the practical application of FGVC. Fine-grained Few-shot Visual Classification (FGFSVC), i.e. learning new fine-grained concepts from only a few samples, has therefore become an urgent problem. We study it from two perspectives: metric-based methods and data-augmentation-based methods.

Metric-based few-shot learning methods are currently very promising because they are simple to implement and perform well. Building on metric methods, we propose an adaptive weighting pyramid convolutional neural network model for FGFSVC. First, low-rank bilinear pooling is used to integrate information from each layer of the convolutional neural network. The resulting image representations contain not only global high-level semantic information but also mid-level and low-level local detail, which helps overcome the challenges of FGVC. Then, we introduce a hard-sample selection strategy and an adaptive weighting method. By selecting informative samples and weighting them adaptively according to their similarities, we learn a better embedding space. The proposed model outperforms state-of-the-art methods on three benchmark fine-grained datasets, demonstrating the effectiveness of the proposed method.

Data-augmentation-based few-shot learning methods, which increase the number and diversity of samples, are another important and intuitive way to address the lack of samples. Building on data augmentation, we propose a feature disentanglement augmentation model for FGFSVC. First, through a self-attention mechanism, we use feature channel responses to locate the more discriminative regions in fine-grained images. At the same time, by constraining the attention for images of the same category to focus on the same region, we extract class-discriminative features that differ between classes and are similar within a class. Then, based on the assumption that intra-class variance induced by conditions such as posture, background, and illumination is shared among all classes, we propose a feature disentanglement method that learns the distribution of intra-class variance for each class and generates, through sampling, new samples that contain both class-discriminative information and diversity, effectively alleviating the problem of insufficient samples. The proposed model improves performance and achieves competitive results on the two most commonly used fine-grained datasets.
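The abstract does not give the exact formulation of the low-rank bilinear pooling step used in the metric-based model. As a rough illustration only, the following NumPy sketch shows one common form of the technique: features from two layers are projected into a shared low-dimensional space, multiplied element-wise, sum-pooled over spatial locations, and projected to the output. The function name `low_rank_bilinear_pool`, the matrices `U`, `V`, `P`, and all dimensions are assumptions made for this example; they are not taken from the thesis.

```python
import numpy as np


def low_rank_bilinear_pool(x, y, U, V, P):
    """Low-rank bilinear pooling of two feature maps (illustrative sketch).

    x: (N, Cx) features from one layer at N spatial locations
    y: (N, Cy) features from another layer at the same N locations
    U: (Cx, d), V: (Cy, d) projections into a shared d-dim space (hypothetical)
    P: (d, k) output projection (hypothetical)
    Returns a (k,) image-level descriptor.
    """
    # Project both feature maps, then take the Hadamard (element-wise) product.
    joint = (x @ U) * (y @ V)      # shape (N, d)
    # Sum-pool over spatial locations to obtain one vector per image.
    pooled = joint.sum(axis=0)     # shape (d,)
    # Project to the final descriptor.
    return pooled @ P              # shape (k,)


# Toy usage with random data standing in for two CNN layers (assumed sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((49, 32))   # e.g. a 7x7 map with 32 channels
y = rng.standard_normal((49, 64))   # same 7x7 grid, 64 channels
U = rng.standard_normal((32, 16))
V = rng.standard_normal((64, 16))
P = rng.standard_normal((16, 8))
print(low_rank_bilinear_pool(x, y, U, V, P).shape)  # (8,)
```

The factorization avoids forming the full Cx-by-Cy bilinear weight for each output dimension, which is why this form is often used when several layers are combined. In a real model the projections would be learned, and the two feature maps would first be resized to a common spatial grid.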
Keywords/Search Tags: Fine-grained visual classification, Few-shot learning, Metric learning, Data augmentation