
Research On Supervised Cross-modal Hashing By Preserving Intermediate State Similarity

Posted on: 2020-09-25    Degree: Master    Type: Thesis
Country: China    Candidate: B W Xiao    Full Text: PDF
GTID: 2428330572988975    Subject: Computer Science and Technology
Abstract/Summary:
After decades of development in information technology, today's society has entered the era of big data. At the individual level, each of us is constantly exposed to many types of media data, such as pictures, videos, audio, and short messages. At the societal level, the amount of data generated every day now exceeds the total generated over thousands of years in the past. Moreover, data has become more and more complex: feature dimensions are growing rapidly, and data often comes in multiple modalities. Processing such data is increasingly inseparable from advances in machine learning.

Information retrieval has long been a research hotspot in computer science. In data retrieval, Precise Nearest Neighbor (PNN) retrieval is a commonly used classical approach. However, in the era of big data, with the growing challenges of data storage and retrieval, exact nearest-neighbor retrieval is no longer practical. At the same time, the rapid development of machine learning, and in particular the emergence of Approximate Nearest Neighbor (ANN) retrieval techniques represented by hashing methods, has given us an efficient tool for information retrieval in the big-data era. Unlike the traditional approach of directly comparing data features, hashing methods map high-dimensional data to compact binary hash codes while preserving the similarity and semantic information among the data. Using the Hamming distance between codes to measure similarity greatly improves retrieval speed, and storing the hash codes instead of the original data also reduces the required storage space.

Today, more and more data appears in multimodal form, which gives retrieval across modalities broad application prospects, such as retrieving images from news text or retrieving videos from text. Compared with single-modal methods, cross-modal hashing methods must consider not only the data relationships within each modality but also the relationships between modalities. Research on these aspects has produced many new methods in recent years, but several problems still require further consideration. Many methods require the hash codes themselves to preserve similarity while modeling the data relationships, but the discrete constraints make the resulting problem difficult to optimize. Some methods relax the binary constraints on the hash codes so that the objective function becomes easier to optimize, but this increases the quantization error and degrades performance. In some methods, the generation of hash codes and the learning of hash functions are carried out separately. Still other methods rely on complex discrete optimization strategies or objective functions, which are usually time-consuming. A good hashing method should maintain the data relationships while avoiding excessive quantization error, and its optimization procedure should be efficient enough to be used easily in subsequent applications.
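To make the hashing idea described above concrete, the following minimal Python/NumPy sketch shows how retrieval reduces to ranking items by the Hamming distance between compact binary codes; the random codes here are placeholders for codes that would normally come from learned hash functions.

```python
import numpy as np

# Toy database of binary hash codes (one row per item); in a real pipeline
# these would be produced by learned hash functions, not drawn at random.
rng = np.random.default_rng(0)
database_codes = rng.integers(0, 2, size=(1000, 64))   # 1000 items, 64-bit codes
query_code = rng.integers(0, 2, size=(64,))

# Hamming distance = number of differing bits (an XOR followed by a popcount).
hamming = np.count_nonzero(database_codes != query_code, axis=1)

# Ranking by Hamming distance approximates nearest-neighbor search, but only
# needs 64 bits per item and bit-level comparisons instead of the original
# high-dimensional floating-point features.
top_k = np.argsort(hamming)[:10]
print(top_k, hamming[top_k])
```

Because each item is reduced to a few dozen bits, both the distance computation and the storage cost are far smaller than with the original features, which is the speed and space advantage referred to above.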
Since making the hash codes themselves preserve data similarity leads to problems that are difficult to solve, we consider whether other variables can be learned to take over this role in place of the hash codes, while keeping their connection to the hash codes so that the hash functions and the final codes can still be generated conveniently. Based on these considerations, this paper proposes a new hashing approach for cross-modal retrieval, i.e., Cross-Modal Hashing by Preserving Intermediate State Similarity. It introduces an intermediate representation for each modality of each instance and preserves similarity in this intermediate state space. It then maps the intermediate representations into binary codes. In this way, the hash functions and binary codes are learned simultaneously, and both the relaxation of the binary constraints and the resulting large quantization error are avoided. We also propose an iterative algorithm to optimize the objective function. Extensive experiments are conducted on three datasets: Wiki, MIRFlickr-25K, and NUS-WIDE. The results demonstrate that our method outperforms or is comparable to state-of-the-art methods for cross-modal retrieval.
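The abstract does not spell out the objective function or the iterative optimization, so the sketch below is only one plausible reading of the intermediate-state idea under simplifying assumptions: linear hash functions, a label-derived similarity matrix, an inner-product similarity loss on the real-valued intermediate representations, and sign quantization to obtain the final codes. All names here (X, Y, Wx, Wy, S) are hypothetical and not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired data: n instances observed in two modalities (image / text
# features) plus multi-label annotations; sizes are purely illustrative.
n, dx, dy, c, bits = 200, 50, 30, 5, 16
X = rng.normal(size=(n, dx))            # image-modality features
Y = rng.normal(size=(n, dy))            # text-modality features
L = rng.integers(0, 2, size=(n, c))     # label matrix
S = (L @ L.T > 0).astype(float)         # semantic similarity: 1 if labels overlap

# Linear maps to a shared intermediate space (an assumption of this sketch).
Wx = rng.normal(scale=0.01, size=(dx, bits))
Wy = rng.normal(scale=0.01, size=(dy, bits))

lr = 1e-3
for _ in range(500):
    U, V = X @ Wx, Y @ Wy               # real-valued intermediate representations
    # Preserve cross-modal similarity in the intermediate state: scaled inner
    # products of U and V should match the similarity matrix S.
    R = U @ V.T / bits - S
    grad_Wx = X.T @ (R @ V) / (n * bits)
    grad_Wy = Y.T @ (R.T @ U) / (n * bits)
    Wx -= lr * grad_Wx
    Wy -= lr * grad_Wy

# Binary codes come from the intermediate state via sign quantization (+/-1),
# and the same projections act as hash functions for unseen queries.
Bx, By = np.sign(X @ Wx), np.sign(Y @ Wy)
```

The point of the sketch is that similarity is enforced on the continuous intermediate representations rather than on the binary codes themselves, so the discrete constraints never enter the optimization step; how the thesis actually links the intermediate state to the codes and carries out its iterative optimization is detailed in the full text.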
Keywords/Search Tags: Approximate nearest neighbor search, Big data, Hashing, Cross-modal retrieval