| Fine-grained image recognition is the task of distinguishing subclasses within the same class. It is more challenging than coarse-grained image recognition because fine-grained images exhibit small inter-class differences and large intra-class differences. Extracting features from discriminative image regions is widely recognized as the key to solving fine-grained recognition tasks, and attention mechanisms are commonly used to strengthen a model's feature extraction ability. In addition, supervised contrastive learning, which pulls samples of the same class closer together and pushes samples of different classes apart, is well suited to fine-grained image recognition, but it places requirements on batch size and data augmentation. To address these issues, this paper makes the following contributions: (1) A supervised contrastive loss function and a corresponding image preprocessing scheme are proposed. Adopting the region confusion mechanism as the data augmentation scheme satisfies the need to construct hard positive samples in contrastive learning; at the same time, destroying the global contour information forces the model to learn more from the discriminative regions of fine-grained images. To overcome the batch-size limitation, repeated category sampling is used to increase the number of positive samples in each batch, which improves the performance of the loss function. (2) An attention mechanism module is designed for the region confusion mechanism. Because the region confusion mechanism increases recognition difficulty and introduces edge noise at the boundaries of image blocks, the attention module is designed to enhance the model's feature extraction ability. The paper also explores the division granularity and difficulty control scheme that best match the model's learning ability, so that the model can learn the features of components of different sizes. Finally, the model adopts a double-ended feature interaction to suppress the introduced edge noise. In summary, this paper proposes a fine-grained image recognition model that combines supervised contrastive learning, an image preprocessing scheme, and an attention mechanism while introducing only a small number of additional parameters, and the model performs well across the reported experiments. |
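
The three ingredients of contribution (1) can be illustrated with a minimal pure-Python sketch. It is not the implementation proposed in the paper: the function names, the `k` jitter parameter, the choice to permute whole patch rows and columns, and the use of the commonly cited form of the supervised contrastive loss (cosine similarity with a temperature) are all assumptions made for illustration.

```python
import math
import random


def jitter_permutation(n, k, rng):
    # Sort indices by (index + uniform noise in [-k, k]); a small k keeps each
    # patch close to its original position, so k acts as a difficulty knob.
    keys = [i + rng.uniform(-k, k) for i in range(n)]
    return sorted(range(n), key=lambda i: keys[i])


def region_confusion(image, n, k, rng):
    """Shuffle an n x n grid of patches (simplified: whole patch rows/columns).

    `image` is a list of pixel rows whose height and width are divisible by n.
    """
    h, w = len(image), len(image[0])
    ph, pw = h // n, w // n
    rows = jitter_permutation(n, k, rng)
    cols = jitter_permutation(n, k, rng)
    out = [[None] * w for _ in range(h)]
    for r in range(n):
        for c in range(n):
            for y in range(ph):
                for x in range(pw):
                    out[r * ph + y][c * pw + x] = image[rows[r] * ph + y][cols[c] * pw + x]
    return out


def repeated_category_batch(labels, num_classes, per_class, rng):
    """Pick a few classes and draw several samples from each (with replacement),
    so every anchor in the batch has positives to contrast against."""
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    batch = []
    for y in rng.sample(sorted(by_class), num_classes):
        batch.extend(rng.choices(by_class[y], k=per_class))
    return batch


def supcon_loss(features, labels, temperature=0.1):
    """Commonly cited supervised contrastive loss, averaged over anchors
    that have at least one positive in the batch."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    z = [unit(v) for v in features]
    total, anchors = 0.0, 0
    for i, zi in enumerate(z):
        sims = {a: sum(p * q for p, q in zip(zi, za)) / temperature
                for a, za in enumerate(z) if a != i}
        positives = [a for a in sims if labels[a] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

In this sketch, a batch built by `repeated_category_batch` guarantees several same-class samples per anchor, which is the situation in which the `positives` list inside `supcon_loss` is non-empty; with ordinary uniform sampling over many fine-grained classes, most anchors in a small batch would be skipped.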