
Research And Application Of Image Retrieval Method Based On Faster-RCNN And Wasserstein Auto-Encoder

Posted on: 2020-12-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zhang
GTID: 2428330599952936
Subject: engineering
Abstract/Summary:
With the rapid development of social networks and user-generated content, the Internet has accumulated a large amount of image data, and people have entered the "image reading era". Meeting the demand for accurate, real-time image retrieval has become a practical problem. Traditional image retrieval methods rely on manually tagged data and keyword matching, which makes them difficult to apply to large-scale image retrieval. Deep neural networks, with their multi-layer structure and strong feature extraction ability, excel at extracting image content features, which to some extent narrows the "semantic gap" between the latent visual information of images and human cognitive semantics. To further refine the search content, improve retrieval accuracy and reduce the influence of image background, instance-level image retrieval has become a hot research topic. This thesis combines deep-learning-based image retrieval with object detection to extract both the global features of images and the local features of objects. It also proposes a Wasserstein convolutional auto-encoder (WCAE) for dimensionality reduction of image features. The innovations and main contents of this thesis are as follows:

(1) For instance-level retrieval, the object detection framework Faster-RCNN is applied to extract global features of images and local features of objects. To improve accuracy, the feature extraction network is fine-tuned on the retrieval dataset. Furthermore, in the image re-ranking stage, a region-based spatial re-ranking method that considers both the object class score and the feature similarity is proposed to improve the accuracy of instance-level image retrieval.

(2) A convolutional auto-encoder model based on the Wasserstein distance (WCAE) is proposed to reduce the dimensionality of image features. WCAE is a nonlinear dimensionality reduction model that compresses data into low-dimensional codes while losing almost no information. Because it uses convolutional layers, WCAE has significant advantages in processing two-dimensional signals. In addition, a region max-pooling function is employed to meet WCAE's fixed input size requirement. In short, WCAE is a general dimensionality reduction method that is trained in an unsupervised manner and does not rely on labeled data, so it has good application prospects.

(3) By combining the Faster-RCNN feature extraction module with the WCAE dimensionality reduction module, this thesis implements an accurate and fast image retrieval model. The model supports coarse-grained retrieval using global image features and fine-grained retrieval using local object features, as well as retrieval with features of different dimensionality before and after reduction. The mean average precision of the proposed retrieval method is 81.3%, 86.9%, 76.2% and 80.2% on the public datasets Oxford5K, Paris6K, Oxford105K and Paris106K, respectively. Compared with current advanced image retrieval methods, the proposed method is more effective.
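The abstract does not give details of the region max-pooling step. As a rough illustration only, the sketch below shows one common way to turn a feature map of arbitrary spatial size into a fixed-size grid: divide each spatial axis into a fixed number of regions and keep the maximum of each region per channel. The function name `region_max_pool`, the `(C, H, W)` layout and the 2x2 default grid are assumptions made for this example, not the thesis's actual implementation.

```python
import numpy as np


def region_max_pool(feature_map, out_h=2, out_w=2):
    """Max-pool a (C, H, W) array into a fixed (C, out_h, out_w) grid.

    Hypothetical helper for illustration; the grid size used in the
    thesis is not stated in the abstract.
    """
    c, h, w = feature_map.shape
    if h < out_h or w < out_w:
        raise ValueError("feature map is smaller than the output grid")

    # Integer bin edges that split each axis into near-equal regions.
    row_edges = [(k * h) // out_h for k in range(out_h + 1)]
    col_edges = [(k * w) // out_w for k in range(out_w + 1)]

    pooled = np.empty((c, out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            region = feature_map[
                :,
                row_edges[i]:row_edges[i + 1],
                col_edges[j]:col_edges[j + 1],
            ]
            # Keep the strongest activation of each channel in this region.
            pooled[:, i, j] = region.max(axis=(1, 2))
    return pooled


if __name__ == "__main__":
    small = np.arange(2 * 4 * 6).reshape(2, 4, 6)
    large = np.random.rand(2, 5, 7)
    # Inputs of different spatial size map to the same output shape.
    print(region_max_pool(small).shape, region_max_pool(large).shape)
```

Because the output shape depends only on `out_h` and `out_w`, feature maps from images or object regions of different sizes could be passed to a model that expects a fixed input size, which is the role the abstract assigns to region max-pooling.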
Keywords/Search Tags:deep learning, image retrieval, Wasserstein distance, auto-encoder