
Research On Unsupervised Hashing Methods For Large-Scale Cross-Modal Retrieval

Posted on: 2024-06-29 | Degree: Master | Type: Thesis
Country: China | Candidate: J M Li | Full Text: PDF
GTID: 2568306923955929 | Subject: Software engineering
Abstract/Summary:
As information technology enters a new stage of development, technologies such as artificial intelligence, human-computer interaction, and the Internet of Everything are advancing rapidly. The multimodal data involved, represented by text, images, and videos, is characterized by large volume, high dimensionality, and complex structure. Meanwhile, people are no longer satisfied with similarity queries within a single modality, but expect cross-modal retrieval between different modalities, such as searching for text by image or for images by text. How to realize cross-modal retrieval efficiently and meet increasingly diverse retrieval needs has become a research hotspot in the field of information retrieval. Unsupervised cross-modal hash learning has attracted wide attention in large-scale retrieval because of its high retrieval efficiency, low storage cost, and good scalability. Unsupervised cross-modal hashing aims to explore the data distribution and geometric structure of different modalities, map the original high-dimensional data into binary codes, and preserve the similarity of the original data. Although existing cross-modal retrieval methods based on unsupervised hash learning have achieved good results, they still have shortcomings. This thesis therefore explores two cross-modal retrieval methods based on unsupervised hash learning, as follows.

To address the insufficient mining of latent semantics and the weak robustness of the common subspace learning process, this thesis proposes Scalable Unsupervised Hashing (SUH) for large-scale cross-modal retrieval. SUH simultaneously exploits latent semantic labels and common feature information within heterogeneous data, through multimodal clustering and collective matrix factorization, respectively. A robust norm is introduced in both processes to make the method insensitive to outliers. Based on the robust consistency exploited from the latent semantic information and the feature embedding, the hash codes can be learned discretely, avoiding cumulative quantization loss.

Furthermore, to explore the abundant correlation information between modalities, reduce the training time complexity, and make the model scalable to large-scale datasets, this thesis proposes Fast Unsupervised Cross-modal Hashing (FUCH) for large-scale retrieval. Specifically, FUCH uses a semantic-aware collective matrix factorization to learn a robust common representation by exploiting latent category-specific attributes, and introduces the Cauchy loss function to measure the factorization error. This process embeds potentially discriminative information into the common space while making the model insensitive to outliers. Moreover, FUCH designs a dual projection learning scheme, which learns not only a modality-unique hash function for each modality to capture its individual properties, but also a modality-mutual hash function for the multimodal data to exploit their correlations. The method seamlessly integrates robust matrix factorization and dual projection learning into a unified hash learning framework, which embeds the latent category semantics, individual semantics, and relevance semantics of the original space into a low-dimensional Hamming space, contributing to the generation of high-quality hash codes. In addition, a highly efficient solution is derived for the discrete optimization. Experimental results on three benchmark datasets verify the effectiveness of the proposed method under various scenarios.
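
Two ideas in the abstract can be illustrated with a small sketch: why a Cauchy-type loss is less sensitive to outliers than a squared loss, and how binary codes allow retrieval across modalities by Hamming distance. This is not the SUH or FUCH algorithm. The function names, the sign-based binarization, the toy vectors, and the particular Cauchy form log(1 + (r/gamma)^2) are all assumptions made for illustration, and the thesis may use a different formulation.

```python
import math


def cauchy_loss(residual, gamma=1.0):
    """Assumed Cauchy form: log(1 + (r / gamma)^2). Grows only logarithmically."""
    return math.log1p((residual / gamma) ** 2)


def squared_loss(residual):
    """Ordinary squared error, shown for comparison; grows quadratically."""
    return residual ** 2


def to_hash_code(projection):
    """Binarize a real-valued vector by sign into a tuple of 0/1 bits."""
    return tuple(1 if value >= 0 else 0 for value in projection)


def hamming_distance(code_a, code_b):
    """Count the bit positions where two equal-length codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def rank_by_hamming(query_code, database_codes):
    """Return database indices ordered by Hamming distance to the query."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Hypothetical projected features: one text query and three images,
# assumed to already live in a shared 4-dimensional common space.
text_query = to_hash_code([0.7, -0.2, 0.1, -0.9])
image_db = [
    to_hash_code([-0.5, 0.3, -0.1, 0.8]),
    to_hash_code([0.6, -0.1, 0.2, -0.4]),
    to_hash_code([0.9, 0.4, 0.3, -0.2]),
]
ranking = rank_by_hamming(text_query, image_db)

# An outlier residual of 10 is penalized far less by the Cauchy loss.
outlier_cauchy = cauchy_loss(10.0)
outlier_squared = squared_loss(10.0)
```

In this toy setup the text query is compared against image codes using only bit comparisons, which is what makes hashing-based retrieval cheap in storage and time. The bounded growth of the Cauchy penalty is one common reason such losses are described as robust to outliers.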
Keywords/Search Tags: Cross-modal retrieval, Unsupervised hashing, Matrix factorization, Robust norm, Discrete optimization