| Hashing-based methods have received much attention in cross-modal retrieval because of their low storage cost and fast retrieval speed. The main problem in cross-modal retrieval is how to generate discriminative hash codes while preserving as much of the semantic and structural similarity of the data as possible in the hash space. However, existing cross-modal hashing algorithms underestimate the importance of semantic differences between label categories and ignore the balance condition on the hash vectors. These problems may result in poorly discriminative hash codes. In addition, several aspects of multi-label data deserve further exploration. To preserve the precise similarity between heterogeneous data, the manifold structure relationships, and the balance of the hash vectors, this thesis proposes supervised hashing methods for cross-modal retrieval. The main work of this thesis is as follows: (1) A supervised cross-modal retrieval method based on discriminative matrix factorization hashing (DMFH) is proposed. It uses collective matrix factorization to process the kernelized features and obtain a shared latent space for feature mapping. To better measure the similarity of heterogeneous data, the similarity between two samples is described by the proportion of labels they have in common. In addition, a balanced matrix is constructed from the labels to generate balanced hash vectors and maximize the gaps between different class labels. Experiments on two widely used multi-label datasets verify the effectiveness of the proposed method in maintaining semantic information and similarity relations. Compared with JIMFH, the best-performing matrix factorization baseline, the method improves mean average precision by 6% and 8%, respectively. Compared with seven advanced cross-modal hashing retrieval methods on two commonly used multi-label datasets, MIRFlickr and NUS-WIDE, DMFH achieves the best mean Average Precision (mAP) on both the I2T (Image to Text) and T2I (Text to Image) tasks, and its mAP on T2I is higher, indicating that the method exploits the multi-label semantic information in the text modality more effectively. (2) A discrete hashing method based on label-guided cross-modal retrieval is proposed. With manifold structure and semantic preservation in mind, this method combines balanced hash vectors, semantic similarity, and manifold structure in a joint formulation and preserves them in Hamming space. To explore the potential manifold relationships, a matrix describing the manifold structure is proposed, which is built from the local category distribution of the k nearest neighbors. In addition, to maximize the gaps between different categories, a balanced matrix is constructed from the labels to generate hash codes with balanced bits. For multi-label data, which is common in real-world scenarios, a novel multi-label manifold and balanced structure matrix is also designed. An effective discrete optimization method is used to reduce semantic loss. Extensive experiments on three benchmark datasets demonstrate that the proposed method improves performance by about 2% and 3% on average on the different cross-modal tasks. |
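
The label-based similarity described in (1) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this example assumes that the "proportion of common labels" is the number of shared labels divided by the number of labels present in either sample (Jaccard overlap). The function name `label_overlap_similarity` and the toy label matrix are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def label_overlap_similarity(labels: Sequence[Sequence[int]]) -> List[List[float]]:
    """Pairwise similarity between samples given binary multi-label vectors.

    Assumption (not confirmed by the source): similarity is the number of
    labels two samples share divided by the number of labels that appear
    in at least one of them.
    """
    n = len(labels)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            pairs = list(zip(labels[i], labels[j]))
            shared = sum(1 for a, b in pairs if a and b)
            union = sum(1 for a, b in pairs if a or b)
            # Two samples with no labels at all are treated as dissimilar.
            sim[i][j] = shared / union if union else 0.0
    return sim


# Toy data: three samples, four possible labels (hypothetical).
labels = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
similarity = label_overlap_similarity(labels)
```

Unlike a binary "shares at least one label" indicator, a graded value of this kind distinguishes samples that share most of their labels from samples that share only one, which is the motivation the abstract gives for using a proportion.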