Research On Image Descriptor Calculation Methods For Image Retrieval

Posted on: 2024-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y Tao
GTID: 2568307127463784
Subject: Software engineering
Abstract/Summary:
Image descriptors describe image content, and their discriminative ability is one of the key factors affecting the quality of image retrieval. Current research on image descriptors falls into two categories: constructing descriptors from low-level visual features, and constructing descriptors that carry semantic information through network learning. Low-level features describe only a single aspect of the content, cannot capture the deep semantic information of an image, and in practical applications suffer from problems such as high dimensionality and weak representational power. Conventional networks do not describe image keypoint information sufficiently, and using only the features of convolutional or fully connected layers is not enough to complete large-scale, instance-level image retrieval tasks. After analyzing and studying the descriptor formed by aggregating traditional hand-crafted features (VLAD) and the descriptor formed by orthogonal fusion of local and global features extracted by networks (DOLG), this thesis designs compact and efficient image descriptors that improve the performance of the original algorithms without increasing the dimensionality of the original descriptors. The main work of this thesis is as follows:

(1) The retrieval accuracy of the vector of locally aggregated descriptors can be improved by increasing the number of cluster centers in the codebook, but this increases the vector dimension and the storage space required. To address this, an image descriptor called IM-VLAD (Improved VLAD) is proposed, which combines soft assignment based on codeword expansion with a two-level codebook structure while keeping the vector dimension unchanged. In the codebook training stage, the K-means clustering algorithm is used to train the first-layer visual codebook on local image features. The second-layer codebook is then trained on the features belonging to each cluster center. In the descriptor computation stage, a soft assignment method based on codeword expansion is designed. For each local feature of the image, a new codeword is expanded in the second-layer codebook from the nearest codeword, and its weight is assigned to the nearest codeword. The residual direction corresponding to the local feature is then computed and accumulated. On this basis, the residual vectors of the local features are aggregated layer by layer, from the corresponding codewords of the second layer up to the first layer, and concatenated to obtain IM-VLAD.

(2) To further improve the retrieval accuracy of orthogonal fusion descriptors, an orthogonal fusion descriptor based on global attention is proposed. Dilated convolution is applied in the local feature extraction branch to extract multi-scale feature maps. The output features are concatenated and then passed through the Global Attention Mechanism (GAM), which uses multi-layer perceptrons to capture the relevant channel and spatial information; the result is the final local feature. Global feature vectors are generated in the high-dimensional global branch after generalized global pooling and full-convolution processing. In addition, an angular-margin loss function with sub-class centers is introduced to constrain training. The components of the local features that are orthogonal to the global feature vector are extracted and concatenated with the global features to form the final descriptor.

Finally, experiments on the public datasets Holidays, UKBench, and Holidays_Flickr1M show that the IM-VLAD descriptor can significantly improve the performance of traditional features used for image retrieval, and that IM-VLAD achieves better retrieval accuracy than other related improved VLAD algorithms. Experimental results on the public datasets ROxford5k and RParis6k show that the improved orthogonal fusion descriptor can be trained end-to-end using image-level labels and can complete image retrieval tasks efficiently in a single stage.
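For readers unfamiliar with the baseline that item (1) builds on, the following is a minimal sketch of plain VLAD with hard assignment. It is not IM-VLAD: the two-level codebook and the codeword-expansion soft assignment are not reproduced here. The function name `vlad`, the random toy data, and the power-plus-L2 normalization step (a common VLAD convention) are assumptions made for illustration. The sketch mainly shows why the descriptor dimension grows as K × d when the number of codewords K is increased.

```python
import numpy as np

def vlad(features, codebook):
    """Baseline VLAD (illustrative): sum residuals to the nearest codeword."""
    k, d = codebook.shape
    # Squared Euclidean distance from every local feature to every codeword.
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)  # hard assignment to the closest codeword
    desc = np.zeros((k, d))
    for i in range(k):
        members = features[nearest == i]
        if len(members):
            # Accumulate the residuals (feature minus codeword) for cluster i.
            desc[i] = (members - codebook[i]).sum(axis=0)
    desc = desc.ravel()  # concatenate per-codeword residual sums: K * d values
    # Assumed normalization: signed square root, then L2.
    desc = np.sign(desc) * np.sqrt(np.abs(desc))
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

# Toy data standing in for local features and a K-means codebook.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))   # 200 local features, d = 8
codebook = rng.normal(size=(4, 8))     # K = 4 codewords
descriptor = vlad(features, codebook)
print(descriptor.shape)  # (32,)
```

Doubling K to 8 in this sketch doubles the descriptor length to 64, which is the storage cost the abstract says IM-VLAD is designed to avoid.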
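The orthogonal fusion step mentioned in item (2) can be sketched as follows, assuming a local feature map of shape (C, H, W) and a non-zero global vector of length C. The function name `orthogonal_fusion`, the average pooling over spatial positions, and the random inputs are illustrative assumptions; the attention module, the loss function, and any final projection layer of the actual network are omitted.

```python
import numpy as np

def orthogonal_fusion(local_map, global_vec):
    """Remove from each local feature its projection onto the global vector,
    then concatenate the global vector with the pooled orthogonal part."""
    c, h, w = local_map.shape
    local = local_map.reshape(c, h * w)            # one column per position
    # Projection coefficient of every local feature onto global_vec.
    coeff = (global_vec @ local) / (global_vec @ global_vec)
    proj = np.outer(global_vec, coeff)             # projections, shape (C, H*W)
    orth = local - proj                            # components orthogonal to global_vec
    pooled = orth.mean(axis=1)                     # assumed: average pooling
    return np.concatenate([global_vec, pooled]), orth

rng = np.random.default_rng(0)
local_map = rng.normal(size=(16, 4, 4))   # toy local feature map, C = 16
global_vec = rng.normal(size=16)          # toy global feature vector
fused, orth = orthogonal_fusion(local_map, global_vec)
print(fused.shape)  # (32,)
```

Every column of `orth` has a zero dot product with `global_vec` (up to floating-point error), so the concatenated descriptor carries only the local information that the global feature does not already contain.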
Keywords/Search Tags: descriptor, codeword expansion, feature allocation, global attention, feature fusion