
Research On Cross-modal Hashing Methods In Complex Scenarios

Posted on: 2024-08-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Sun
Full Text: PDF
GTID: 2568306923474704
Subject: Software engineering
Abstract/Summary:
As technology advances and information transforms, big data, as a new data form, is becoming increasingly linked to human social life. To reap the benefits of computer research in the big data era, we must mine and extract key information from big data and then use it to solve current social problems. Because of the large volume, complex structure, high dimensionality, and instability of big data, some conventional nearest-neighbor search methods are ineffective. As a compromise scheme for meeting the requirements of large-scale multimedia retrieval, approximate nearest neighbor search sacrifices some precision to improve retrieval efficiency. Among such approaches, hashing-based retrieval is widely used due to its excellent similarity preservation, high query speed, and low storage consumption. As a result, hashing-based cross-modal retrieval has emerged as an important topic in computer science.

In real scenarios, the structure of data is complex. For example, some data carry hierarchical labels that describe them from coarse to fine granularity. However, the majority of existing methods ignore the supervision information contained in hierarchical labels. In addition, incremental data with new categories keep arriving, which places new demands on the generalization capability of the model. However, most deep cross-modal hashing methods merge the original and incremental data into a single training set and then retrain the model. Therefore, research on cross-modal hashing retrieval approaches for different data types in complex scenarios is promising. To address the above problems, this thesis designs cross-modal hashing retrieval models for two specific situations. The following are the primary contributions:

(1) The first part introduces a supervised hierarchical cross-modal hashing approach for hierarchically labeled data, namely TwO-step hieRarchical Cross-modal Hashing, TORCH for short. It is divided into two parts: hash code learning and hash function learning. During hash code learning, the supervised information of the finest labels is embedded into the hash codes and the valued representations of the finest labels, which could improve the discrimination of the model; then, a relationship matrix of cross-layer labels is designed to learn the representations of the finest labels while retaining the hierarchical information of the labels. During hash function learning, two variants of the proposed model are learned, one using linear mappings and the other using nonlinear mappings.

(2) The second part presents a new incremental cross-modal hashing retrieval method for newly arriving data, namely Deep Incremental Cross-modal Hashing, DICH for short. It mainly consists of the following two phases. In the learning of incremental hash codes, first, the representation of the seen labels in the original database is extracted; second, the similarity between the seen labels and the unseen labels in the incremental database is defined to supervise the representation of the unseen labels; finally, the representation of the incremental data is learned directly from the label matrix and the unseen labels. In the learning of efficient hash functions, first, anchor points are extracted from the original database and the incremental database; then, the deep features of the anchor set are extracted; finally, the hash representation of the anchor set is obtained via the network’s hashing layer, and the hash codes of the learned anchor points are used to update the parameters of the deep network during this process.
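
The general idea of hashing-based cross-modal retrieval described above (map each modality to binary codes, then rank by Hamming distance) can be illustrated with a minimal sketch. This is not TORCH or DICH: the function names `hash_codes`, `hamming_distances`, and `rank_by_hamming` are hypothetical, and random projection matrices stand in for hash functions that the thesis would learn from labels.

```python
import numpy as np

def hash_codes(features, projection):
    """Map real-valued features to {-1, +1} codes via sign(features @ projection)."""
    codes = np.sign(features @ projection)
    codes[codes == 0] = 1  # break ties so every bit is -1 or +1
    return codes.astype(np.int8)

def hamming_distances(query_code, database_codes):
    """Hamming distance between one code and every row of database_codes."""
    n_bits = database_codes.shape[1]
    # For {-1, +1} codes: d_H(q, b) = (n_bits - <q, b>) / 2
    inner = database_codes.astype(np.int32) @ query_code.astype(np.int32)
    return (n_bits - inner) // 2

def rank_by_hamming(query_code, database_codes):
    """Database indices sorted from nearest to farthest in Hamming space."""
    return np.argsort(hamming_distances(query_code, database_codes), kind="stable")

# Toy usage (assumed setup): a text query retrieves from an image database.
rng = np.random.default_rng(0)
n_bits = 16
image_features = rng.normal(size=(5, 32))   # 5 images, 32-dim features
text_features = rng.normal(size=(1, 20))    # 1 text query, 20-dim features
W_image = rng.normal(size=(32, n_bits))     # stand-in for a learned image hash function
W_text = rng.normal(size=(20, n_bits))      # stand-in for a learned text hash function

image_codes = hash_codes(image_features, W_image)
text_code = hash_codes(text_features, W_text)[0]
ranking = rank_by_hamming(text_code, image_codes)
```

With untrained random projections the ranking carries no semantic meaning; the sketch only shows why retrieval is fast and compact, since each item is stored as a short binary code and compared with integer arithmetic.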
Keywords/Search Tags: Cross-Modal Retrieval, Learning to Hash, Label Hierarchy, Incremental Learning