| In recent years, the storage capacity of people’s smart devices has grown steadily, and so has the amount of image data each person keeps. At the same time, a large volume of image data is constantly being shared on the Internet. Accurately and efficiently finding the images people need in this massive collection is a challenge. Similar images are retrieved by matching image features, which are divided into global features and local features. Using only global features or only local features for retrieval is called single-stage image retrieval, while screening candidates with global features and then reranking them with local features is called two-stage image retrieval. Two-stage methods currently achieve better retrieval results, but they inevitably incur greater computational overhead for feature matching and require more storage for image feature data; they also cannot avoid the accumulation of errors across stages. Single-stage image retrieval avoids these problems. Good single-stage retrieval requires a good image feature representation, so this paper studies a single-stage image retrieval algorithm based on the attention mechanism and feature fusion: the attention mechanism is used to extract good global and local features, and feature fusion is used to obtain an image representation that combines the strengths of both. The main research content is as follows: (1) An image retrieval algorithm that fuses local and global features based on the attention mechanism is proposed. For local features, an iterative attention model with learnable templates is introduced to extract more effective local features. Similar feature pairs are obtained by matching local features between similar images; these pairs are then compared with the corresponding features of non-similar images to obtain a loss that acts directly on the local features. In addition, the model’s own attention map loss is introduced to strengthen the feature extraction ability of the iterative attention model. The resulting local features are fused orthogonally with the global features, which removes redundant information from the local features and yields the final image representation for single-stage image retrieval. Experiments show that this method improves image retrieval performance to a certain extent. (2) A multi-level feature fusion image retrieval algorithm based on a visual attention model is proposed, which improves the capability of visual attention models on image retrieval tasks through multi-level feature fusion. Swin Transformer is selected as the backbone network for feature extraction, and its multi-level, multi-scale features are fused to obtain the final global image representation for single-stage image retrieval. During fusion, pooling and a 1×1 convolution kernel are used to downsample the features of the previous stage so that they match the dimensions of the next stage’s features, and the orthogonal fusion method is used to merge the features of each stage. Experiments show that the method performs well on image retrieval tasks. (3) To address the need for image retrieval over personal private data, an image retrieval system is designed and implemented; the retrieval models in the system are the two methods proposed in this paper. |
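The orthogonal fusion step described in (1) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the version below is an assumption: each local feature has its projection onto the global feature subtracted, the remaining orthogonal components are averaged, and the result is concatenated with the global feature. The function names (`orthogonal_component`, `orthogonal_fusion`, `rank_by_similarity`) and the use of average pooling are hypothetical choices for illustration, not the paper's implementation.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def orthogonal_component(local, global_feat):
    """Remove from `local` the part that lies along `global_feat`.

    The removed part is the information the global feature already carries,
    so what remains is orthogonal to the global feature.
    """
    scale = dot(local, global_feat) / dot(global_feat, global_feat)
    return [l - scale * g for l, g in zip(local, global_feat)]


def orthogonal_fusion(local_feats, global_feat):
    """Fuse local features with a global feature into one descriptor.

    Assumed recipe: average the orthogonal components of all local
    features, concatenate with the global feature, then L2-normalize.
    """
    residuals = [orthogonal_component(f, global_feat) for f in local_feats]
    pooled = [sum(column) / len(residuals) for column in zip(*residuals)]
    return l2_normalize(list(global_feat) + pooled)


def rank_by_similarity(query, database):
    """Single-stage retrieval: sort database indices by descending
    inner product with the query (cosine similarity for unit vectors)."""
    scores = [dot(query, d) for d in database]
    return sorted(range(len(database)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Toy 2-D features; real descriptors would have hundreds of dimensions.
    global_feat = [1.0, 0.0]
    local_feats = [[1.0, 2.0], [3.0, 4.0]]
    fused = orthogonal_fusion(local_feats, global_feat)
    print(len(fused))  # global and pooled local parts concatenated
```

Because the fused descriptor is a single vector per image, retrieval needs only one similarity ranking over the database, which is what distinguishes the single-stage setting from a two-stage pipeline with a separate local reranking pass.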
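The stage-alignment step described in (2), where pooling and a 1×1 convolution bring an earlier stage's features to the shape of the next stage, can be sketched as follows. This is an illustration under stated assumptions: the pooling is taken to be 2×2 average pooling, the feature map is a plain nested list of channel vectors, and the weight matrix is a hypothetical fixed example rather than learned parameters. It relies on the fact that consecutive Swin Transformer stages halve the spatial resolution and double the channel count.

```python
def avg_pool_2x2(fmap):
    """Halve height and width by averaging each non-overlapping 2x2 window.

    `fmap` is a list of rows; each row is a list of channel vectors.
    """
    height, width = len(fmap), len(fmap[0])
    pooled = []
    for i in range(0, height - 1, 2):
        row = []
        for j in range(0, width - 1, 2):
            window = [fmap[i][j], fmap[i][j + 1],
                      fmap[i + 1][j], fmap[i + 1][j + 1]]
            row.append([sum(channel) / 4.0 for channel in zip(*window)])
        pooled.append(row)
    return pooled


def conv_1x1(fmap, weight):
    """Apply a 1x1 convolution: the same linear map at every position.

    `weight` has shape (c_out, c_in); no bias term is used in this sketch.
    """
    return [
        [[sum(w * x for w, x in zip(w_row, pixel)) for w_row in weight]
         for pixel in row]
        for row in fmap
    ]


def align_to_next_stage(fmap, weight):
    """Pool to the next stage's resolution, then project to its channel count."""
    return conv_1x1(avg_pool_2x2(fmap), weight)


if __name__ == "__main__":
    # A 4x4 map with 2 channels is aligned to a 2x2 map with 4 channels.
    stage_features = [[[float(i), float(j)] for j in range(4)] for i in range(4)]
    weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]  # hypothetical values
    aligned = align_to_next_stage(stage_features, weight)
    print(len(aligned), len(aligned[0]), len(aligned[0][0]))
```

Once both stages share the same shape, they can be merged position by position, for example with the orthogonal fusion the abstract mentions.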