
Learning To Hash For Cross-modal Retrieval

Posted on: 2021-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: H T Wang
GTID: 2428330611467604
Subject: Software engineering
Abstract/Summary
In the field of information retrieval, the rapid development of information technology and the mobile Internet has made multimedia data both vast in volume and rich in type. Traditional search engines and uni-modal retrieval methods can no longer meet users' demands for cross-modal retrieval. Faced with massive heterogeneous multimedia data, cross-modal retrieval based on hash learning offers high computational efficiency and low storage cost, making it an effective technical means for large-scale multimedia information retrieval. This thesis focuses on robust hash learning, multi-view hash learning, multi-modal semantic mining, and efficient hash learning frameworks for cross-modal retrieval. The main research contents are as follows:

We propose a Robust Multi-View Hashing method for cross-modal retrieval. First, to enhance the robustness of subspace learning, we use data reconstruction to reduce the information loss incurred when mapping data into the subspace. Second, we construct a novel similarity matrix through multi-view learning and embed this multi-view similarity structure into the low-dimensional subspace via manifold learning. Finally, we learn the hash codes discretely, which reduces their quantization error. Experimental results show that the proposed method outperforms existing methods.

We propose a Supervised Consistent and Specific Hashing method for cross-modal retrieval. First, we explicitly decompose the mapping matrices into a consistent part and a modality-specific part, so as to capture both the latent semantics shared across modalities and the private properties of each modality. Then, label information is embedded into the hash codes through regression, which improves the discriminative power of the hash codes and reduces the time complexity of the algorithm. Experimental results show that the proposed method outperforms existing methods.

We propose an asymmetric learning framework and extend the Supervised Consistent and Specific Hashing method to it, further enhancing the discriminative power of the hash codes. First, we learn a latent semantic subspace for multimedia data through consistent and modality-specific projection matrices. Then, the hash codes are generated by an asymmetric coding function. Unlike previous methods, the hash function and the hash codes are optimized iteratively within the same objective function, and the time complexity of the proposed method is much lower than that of previous methods. Experimental results show that the proposed method outperforms existing methods. A simplified sketch of the shared pipeline is given below.

In conclusion, this thesis proposes several cross-modal hashing methods that improve both the accuracy and the speed of cross-modal retrieval, providing theoretical and technical support for efficient and accurate cross-modal retrieval.
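To make the pipeline shared by the three methods more concrete, the following is a minimal NumPy sketch of supervised cross-modal hashing: binary codes are derived from label information by regression, each modality learns a projection onto the shared codes by ridge regression, and retrieval ranks database items by Hamming distance. All symbols (X_img, X_txt, W, B, P_img, P_txt) and the closed-form updates are illustrative assumptions for this toy example only; the sketch omits the consistent/modality-specific decomposition, the asymmetric coding function, and the discrete optimization described above.

```python
import numpy as np

# Hypothetical toy setup: n samples, two modalities (image/text), c classes.
rng = np.random.default_rng(0)
n, d_img, d_txt, c, r = 500, 128, 64, 10, 32   # r = hash code length

X_img = rng.standard_normal((n, d_img))        # image features
X_txt = rng.standard_normal((n, d_txt))        # text features
Y = np.eye(c)[rng.integers(0, c, n)]           # one-hot label matrix
lam = 1.0                                      # regularization weight (illustrative)

# Step 1: embed label information into binary codes via regression,
# B = sign(Y W); W is an illustrative projection of the label space.
W = rng.standard_normal((c, r))
B = np.sign(Y @ W)
B[B == 0] = 1

# Step 2: learn one projection per modality that regresses its features onto
# the shared codes (ridge regression, closed form):
#   P_m = argmin_P ||X_m P - B||_F^2 + lam ||P||_F^2
def ridge_projection(X, B, lam):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ B)

P_img = ridge_projection(X_img, B, lam)
P_txt = ridge_projection(X_txt, B, lam)

# Step 3 (retrieval): hash a text query and rank image codes by Hamming distance.
def encode(X, P):
    H = np.sign(X @ P)
    H[H == 0] = 1
    return H

query_codes = encode(X_txt[:1], P_txt)
db_codes = encode(X_img, P_img)
hamming = (r - query_codes @ db_codes.T) / 2   # Hamming distance from +/-1 inner product
ranking = np.argsort(hamming.ravel())
print("top-5 retrieved items:", ranking[:5])
```

In this simplified form both modalities are encoded symmetrically; the asymmetric framework in the thesis instead fixes the database codes and learns the query-side hash function against them within one objective.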
Keywords: Cross-modal retrieval, Hash learning, Discrete optimization, Multimedia data, Iteration