
Cross-Modal Hashing For Surface Material Retrieval

Posted on: 2023-05-14    Degree: Master    Type: Thesis
Country: China    Candidate: Z Y Ye    Full Text: PDF
GTID: 2568306836968559    Subject: Signal and Information Processing
Abstract/Summary:
With the breakthrough development of fifth-generation (5G) communication technology and smart devices, multimedia data consisting mainly of images, texts, and videos is growing at a massive rate. Against this backdrop, people have raised deeper, multi-dimensional demands on the experience of human-computer interaction, and cross-modal retrieval technology has emerged in response. Traditional cross-modal retrieval involves only visually perceived modalities such as images and text, and cannot meet the needs of new-generation applications such as VR, telemedicine, and autonomous driving. Therefore, to push the immersive experience of human-computer interaction into a new dimension, we introduce touch, one of the three major human senses, into cross-modal retrieval, enabling free retrieval between visual and tactile data, that is, cross-modal material surface retrieval. In addition, with the explosive growth of multi-modal data, cross-modal retrieval faces bottlenecks such as high storage cost and long retrieval time, which hinder its further large-scale application. We therefore combine hash learning with cross-modal retrieval, exploiting the significant advantages of binary hash codes in storage cost and retrieval speed to overcome this bottleneck, and propose cross-modal material surface hashing retrieval. The specific research contents of this thesis are as follows:

(1) Achieving high-precision cross-modal material surface retrieval requires the support of large-scale, high-quality datasets. However, existing public datasets are largely built on visual modal data, mainly images and text, and cannot meet the cross-modal requirements. We therefore propose an augmentation model for visual and tactile data based on generative adversarial networks, which fits the distribution of the original dataset to generate high-quality new data containing complex features.

(2) Most traditional cross-modal retrieval models feed all the semantic feature information in the original data directly into model training, ignoring the large amount of redundant information it contains, which leads to unsatisfactory retrieval accuracy. To address this issue, we propose a cross-modal material surface hashing retrieval model based on a self-attention mechanism. Specifically, we exploit self-attention to extract the cross-modally relevant parts of the original data, cull the redundant irrelevant parts, and use the relevant parts to provide semantic support for cross-modal material surface retrieval.

(3) Building on the second research content, we recognize that both cross-modal correlated information and uncorrelated information are valuable for constructing cross-modal material surface retrieval. We therefore generalize correlated and uncorrelated information into shared information and private information, and propose cross-modal material surface retrieval based on joint shared-private information enhancement. Specifically, shared information denotes semantic feature information that coexists across multiple modalities and differs only in its form of representation, whereas private information denotes redundant information, mainly noise and background, that is implicit in a single modality and not universal in multi-modal scenarios. Exploiting the complementarity and orthogonality between these two kinds of information, we construct higher-precision cross-modal material surface retrieval.
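To make the hashing-retrieval idea concrete, the sketch below shows why binary codes give fast, low-cost retrieval: once visual and tactile samples are mapped to hash codes, cross-modal search reduces to Hamming-distance ranking. The random projections here are stand-ins for the learned hashing networks of the thesis, and all names and dimensions are illustrative assumptions, not the actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def hash_codes(features, projection):
    """Binarize projected features into {0, 1} codes by sign thresholding."""
    return (features @ projection > 0).astype(np.uint8)

def hamming_rank(query_code, database_codes):
    """Return database indices sorted by ascending Hamming distance."""
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    return np.argsort(distances, kind="stable"), distances

n_bits, dim = 32, 128
visual_proj = rng.standard_normal((dim, n_bits))   # stand-in for the visual hashing net
tactile_proj = rng.standard_normal((dim, n_bits))  # stand-in for the tactile hashing net

# Tactile "database": 100 samples hashed to 32-bit codes.
tactile_db = rng.standard_normal((100, dim))
db_codes = hash_codes(tactile_db, tactile_proj)

# A visual query is hashed with its own network, then matched in Hamming space.
visual_query = rng.standard_normal(dim)
query_code = hash_codes(visual_query[None, :], visual_proj)[0]

ranking, dists = hamming_rank(query_code, db_codes)
print(ranking[:5])  # top-5 tactile candidates for the visual query
```

Because each code is a short bit vector, storage is a few bytes per sample and the distance computation is a bitwise comparison, which is the storage/speed advantage the abstract refers to.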
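The orthogonality between shared and private information mentioned in research content (3) is commonly enforced with a penalty on the inner product of the two components. The following is a minimal sketch of that idea under assumed names and shapes; it is an illustration of the general technique, not the thesis's actual loss.

```python
import numpy as np

def orthogonality_penalty(shared, private):
    """Sum of squared per-sample inner products between shared and private features."""
    return np.sum((shared * private).sum(axis=1) ** 2)

rng = np.random.default_rng(1)
shared = rng.standard_normal((8, 16))  # hypothetical shared features, 8 samples

# Projecting each sample's shared direction out of a raw feature yields a
# private component that is orthogonal to it, so the penalty is ~zero.
raw = rng.standard_normal((8, 16))
coeff = (raw * shared).sum(axis=1, keepdims=True) / (shared * shared).sum(axis=1, keepdims=True)
private = raw - coeff * shared

print(round(float(orthogonality_penalty(shared, private)), 6))  # → 0.0 (up to float error)
```

In training, minimizing such a penalty pushes the private branch away from the shared semantics, so the shared codes carry the cross-modal content while noise and background stay in the private component.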
Keywords/Search Tags:Cross-modal Retrieval, Material Surface, Data Enhancement, Self-Attention, Hashing Learning, Deep Learning