Research On Cross-Modal Deep Adversarial Metric Learning

Posted on: 2023-05-16 | Degree: Master | Type: Thesis
Country: China | Candidate: A Q Ding | Full Text: PDF
GTID: 2558306911482254 | Subject: Computer Science and Technology
Abstract/Summary:
With the rapid spread and development of the mobile Internet and smart devices, the amount of multimodal data (such as images, text, video, and audio) has exploded. Because data of different modalities follow different feature distributions, it is difficult to measure the similarity between them directly; this difficulty is known as the heterogeneous gap. Cross-modal metric learning addresses the heterogeneous gap by learning a distance function between multimodal data and using that function to map data of different modalities into a common subspace, in which cross-modal similarity can be computed directly. The approach has attracted extensive attention from researchers. However, existing cross-modal metric learning methods have two problems. First, they are mainly designed for data with single-layer labels and cannot effectively use hierarchical label information when the data carry hierarchical labels. Second, they ignore fine-grained intra-modality and inter-modality similarity relationships when learning the metric function. Modality adversarial learning introduces discriminator networks to reduce the difference between the outputs that the feature learning networks produce for different modalities, which gives cross-modal metric learning an effective way to reduce the heterogeneous gap. Building on modality adversarial learning, this thesis proposes two cross-modal deep adversarial metric learning methods that address the two problems above. The main work of this thesis is as follows:

First, cross-modal deep hierarchical adversarial metric learning is proposed to address the inability of existing methods to use hierarchical label information effectively. The method uses this information by establishing multiple metric learning sub-networks, one for each layer of the hierarchical label. A multi-layer classification learning mechanism is designed so that the learned multimodal features retain the similarity relationships between the hierarchical labels. The method then incorporates a modality adversarial learning mechanism, which reduces the heterogeneous gap between multimodal features through adversarial training between the modality classification network and the feature learning network. Experimental results on two benchmark cross-modal datasets with hierarchical labels show that the proposed method performs better than existing cross-modal metric learning methods.

Second, cross-modal deep adversarial metric learning based on an attention mechanism is proposed to address the fact that existing methods ignore fine-grained intra-modality and inter-modality similarity relationships. The method uses the attention mechanism to learn these fine-grained relationships. It first builds a self-attention module and a cross-attention module to learn the intra-modality and inter-modality similarity relationships, respectively. It then incorporates modality adversarial learning to reduce the dissimilarity among the generated multimodal features. Next, a self-similarity polynomial loss function is used in place of the triplet ranking loss function to further improve the convergence speed and performance of the model. Experimental results on two benchmark cross-modal datasets show that the proposed attention-based method outperforms existing cross-modal metric learning methods.
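
For readers unfamiliar with the triplet ranking loss that the abstract names as the baseline objective, the following is a minimal sketch of its common hinge-based form, computed on embeddings that already lie in a shared subspace. It is not the thesis's implementation, and it does not show the self-similarity polynomial loss, whose form the abstract does not give. The use of cosine similarity, the margin value of 0.2, the function names, and the toy two-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors in the common subspace."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_ranking_loss(anchor, positive, negative, margin=0.2):
    """Hinge-based triplet ranking loss (illustrative form, assumed margin).

    The loss is zero once the anchor is more similar to the positive
    than to the negative by at least `margin`.
    """
    s_pos = cosine_similarity(anchor, positive)
    s_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - s_pos + s_neg)


# Toy embeddings (hypothetical): an image, a matching text, an unrelated text.
image_embedding = [1.0, 0.0]
matching_text = [1.0, 0.0]
unrelated_text = [0.0, 1.0]

# Correctly ordered triplet: the margin is already satisfied.
loss_satisfied = triplet_ranking_loss(image_embedding, matching_text, unrelated_text)

# Swapped roles: the "positive" is the unrelated text, so the loss is large.
loss_violated = triplet_ranking_loss(image_embedding, unrelated_text, matching_text)
```

In this toy case the first triplet incurs no loss, while the second is penalized because the anchor is closer to the negative than to the positive.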
Keywords/Search Tags: Cross-modal, Metric Learning, Modality Adversarial Learning, Hierarchical Label, Attention Mechanism