
Research On Multimodal Data Retrieval Algorithms

Posted on: 2022-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: W J Zhang
Full Text: PDF
GTID: 2518306509485064
Subject: Software engineering
Abstract/Summary:
Cross-modal retrieval searches across multimodal data: by correlating the semantics of different modalities, it achieves accurate retrieval from one modality to another. In recent years, thanks to the fast query speed and low storage overhead of hashing, many cross-modal hashing methods have been proposed. Most of them, however, are limited to image and text data, whereas in practical applications multimedia data usually involves two or more media types, making it difficult for previous retrieval methods to achieve good results. This thesis proposes two multimodal data retrieval algorithms that address, respectively, the high-order relationships between data samples and the problem of unpaired data instances.

The Hypergraph-based Discrete Matrix Factorization Hashing (HDMFH) algorithm addresses the neglect of high-order relations among data. It learns a common semantic representation of the different modalities by collective matrix factorization, and leverages hypergraph learning to model the high-order relationships among instances, enhancing the discriminative ability of the learned common representation. The common semantic representation is then mapped to Hamming space through an orthogonal rotation to generate binary codes. In addition, supervised semantic labels are used to bridge the semantic gap across modalities. Because this connection imposes no pairwise constraint between any two modalities, HDMFH scales to multiple modalities.

The Unpaired Cross-Modal Hashing (UCMH) algorithm addresses unpaired data instances, i.e., the lack of one-to-one correspondence between instances of different modalities. It first learns a latent subspace for each modality and applies similarity preservation to maintain the intra-modality similarity of the data samples. By orthogonally rotating and quantizing the latent subspaces, the latent features of each modality are encoded into discrete hash codes, and label information further enhances the discriminative ability of the learned codes. Finally, an affinity matrix bridges the semantic gap across modalities, which enables UCMH to handle single-label and multi-label unpaired cases simultaneously.

This thesis evaluates the two proposed algorithms with three evaluation metrics through extensive experiments on cross-modal and multimodal datasets. The results show that HDMFH can effectively complete various retrieval tasks. In the unpaired scenario, UCMH outperforms the baselines on both cross-modal and multimodal retrieval problems, and it also proves scalable in the paired scenario. Overall, the two proposed algorithms obtain accurate retrieval results and meet the requirements of multimodal retrieval tasks.
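The collective-matrix-factorization-plus-rotation pipeline underlying HDMFH can be illustrated in a minimal NumPy sketch. This is not the thesis's exact formulation (it omits the hypergraph and label terms, and all names, the regularizer `lam`, and the iteration counts are illustrative assumptions): each modality's feature matrix is factorized against one shared latent representation by alternating least squares, and the shared representation is then quantized to binary codes with an ITQ-style orthogonal rotation.

```python
import numpy as np

def cmf_hash(X_list, r=16, iters=20, seed=0):
    """Toy collective matrix factorization hashing.

    Learns one shared latent matrix V for all modalities by minimizing
    sum_m ||X_m - V W_m^T||^2 with alternating least squares, then maps
    V to binary codes via an ITQ-style orthogonal rotation.
    """
    rng = np.random.default_rng(seed)
    n = X_list[0].shape[0]          # all modalities share n samples
    V = rng.standard_normal((n, r))
    lam = 1e-3                      # small ridge term for stability
    for _ in range(iters):
        # Update each modality's projection W_m (least-squares solve).
        Ws = [np.linalg.solve(V.T @ V + lam * np.eye(r), V.T @ X).T
              for X in X_list]
        # Update the shared representation V given all W_m.
        A = sum(W.T @ W for W in Ws) + lam * np.eye(r)
        C = sum(X @ W for X, W in zip(X_list, Ws))
        V = np.linalg.solve(A, C.T).T
    # ITQ-style step: alternate sign quantization and an orthogonal
    # Procrustes fit of the rotation R to minimize ||B - V R||_F.
    R = np.linalg.qr(rng.standard_normal((r, r)))[0]
    for _ in range(10):
        B = np.sign(V @ R)
        U, _, Vt = np.linalg.svd(V.T @ B)
        R = U @ Vt
    return np.sign(V @ R)           # n x r matrix of +/-1 codes
```

In a real system the binary codes of a query from one modality would then be compared by Hamming distance against the stored codes of the other modalities.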
Keywords/Search Tags:Cross-modal retrieval, Multimodal retrieval, Hashing, Hypergraph learning, Unpaired data