Research On Multimodal Data Correlation Analysis Based On Kernel Learning

Posted on: 2019-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhong
Full Text: PDF
GTID: 2428330566984153
Subject: Software engineering
Abstract/Summary:
With the explosive growth of multimedia data on the Internet, how to effectively organize and manage massive multimodal data has become an urgent problem. The key to solving it is to mine the potential correlations across modalities, which in essence means bridging the semantic gap between them. In recent years, cross-modal retrieval and image annotation have been developed to accomplish multimodal correlation tasks. Cross-modal retrieval usually covers image-based text retrieval and text-based image retrieval, while image annotation assigns keywords to untagged images. Both approaches have drawn the attention of researchers. Aiming at multimodal data correlation, this thesis focuses on cross-modal retrieval and image annotation accomplished by the kernel method, and mines the potential correlations between image and text in a nonlinear space.

(1) A Multimedia Feature Mapping and Correlation Learning based Kernel Method for cross-modal retrieval is proposed. First, the algorithm obtains high-level semantic features of the image and text modalities by employing a deep convolutional neural network and a topic model respectively, and improves the generalization ability of the modality features by principal component analysis. This effectively avoids the insufficient expressiveness of low-level hand-crafted features. Then, the kernel method is used to capture the potential correlations of both modalities in the nonlinear space, after which the cross-modal retrieval task is accomplished.

(2) A Multiple Kernel Learning Model Based on Weak Learner for image annotation is proposed. The algorithm uses the synthetic minority oversampling technique to overcome the imbalanced distribution of keywords in the datasets. Furthermore, multiple kernel learning is used to obtain the internal correlations between images and tags, so that the algorithm does not depend on the choice of a single kernel function. To further improve prediction accuracy, a boosting process is incorporated into the algorithm to optimize the keyword classifier.

The algorithm was evaluated separately on three different benchmark datasets. Experimental results show that the proposed kernel-based cross-modal retrieval algorithm can effectively accomplish the cross-modal retrieval task. The comparative experiment on feature dimensionality shows that the performance of the algorithm does not depend on the feature dimension, and that it is applicable to sparse features. At the same time, the proposed image annotation algorithm based on multiple kernel learning can effectively accomplish the automatic image annotation task and shows good label recall ability. As a whole, the two algorithms proposed in this thesis are able to meet the requirements of multimodal correlation, and have good research significance and application value.
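
As an illustration of the oversampling step mentioned for the image annotation algorithm, the following is a minimal sketch of the basic synthetic minority oversampling idea: new samples are interpolated between a minority-class sample and one of its nearest minority-class neighbours. It is not the thesis implementation. The function name `smote`, its parameters, and the toy feature vectors are assumptions introduced here for illustration, and only the Python standard library is used.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (basic SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy features for images that carry a rare keyword.
rare_keyword_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(rare_keyword_features, n_synthetic=6, k=2)
```

Because every synthetic point lies on a segment between two existing minority samples, the rare keyword gains training examples without exact duplicates. In practice the inputs would be the extracted image features rather than these toy vectors.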
Keywords/Search Tags:Kernel Method, Multiple Kernel Learning, Cross-Modal Retrieval, Image Annotation, Multimodal Correlation