With the advent of the big data era, vast amounts of multimedia data are flooding people's digital lives. As a new and efficient information retrieval paradigm, cross-modal retrieval meets the urgent need for multi-modal information retrieval and has become a research hotspot. How to mine the semantic information of multi-modal data and make full use of the implicit semantic relationships between modalities are the key challenges of cross-modal research. Current cross-modal retrieval research generally relies on multi-modal datasets with massive labeled samples. However, industrial applications such as vehicle video, surveillance video, and remote sensing imagery contain large amounts of unlabeled data; owing to missing modalities, low data quality, and high labeling costs, only a small number of usable samples are available. Such data can be defined as small-sample multi-modal data, characterized by scarce available data and by one modality having far fewer samples than another. Training a model on small-sample multi-modal data is difficult and leads to low cross-modal retrieval accuracy; this is defined as the small-sample cross-modal retrieval problem. To solve this problem, this thesis conducts in-depth research on cross-modal retrieval based on deep learning and transfer learning. The main contributions are as follows:

(1) A cross-modal task learning framework based on deep learning is proposed, and an end-to-end Cross-Modal Retrieval and Recognition Net (CMR2Net) is constructed. CMR2Net uses similarity measurement to fuse features (see the first sketch below) and analyzes semantic relationships to associate the high-level features of heterogeneous data, solving the problem of semantic computation between different modalities. To evaluate CMR2Net's cross-modal retrieval performance, a sample cross-matching organization method is used to construct the Special Vehicles Multimode Dataset (SVMD). Image-audio cross-modal retrieval experiments on SVMD show that CMR2Net achieves high retrieval accuracy and effectively learns the semantic correlation between different modalities.

(2) A cross-modal retrieval method for remote sensing images based on transfer learning is proposed. To address cross-modal retrieval with small-sample data, a Transfer Cross-Modal Retrieval and Recognition Net (TCMR2Net) is further constructed; it transfers the model structure and low-level parameters of CMR2Net (see the second sketch below). To evaluate TCMR2Net's cross-modal retrieval performance, visible and near-infrared remote sensing images from the GF-2 satellite are used to construct the Remote Sense Airplane Multimode Dataset (RSAMD). Visible-near-infrared cross-modal retrieval experiments on RSAMD show that TCMR2Net effectively transfers low-level knowledge across modalities and clearly outperforms a model trained without knowledge transfer.

Deep learning and transfer learning are used to mine the latent semantic relationships of multi-modal data, enabling cross-modal retrieval to achieve high precision on small-sample datasets while effectively reducing data preprocessing costs. The method offers theoretical guidance for scientific problems such as small-sample cross-modal retrieval and cross-modal target recognition, and the related algorithms provide a useful reference for developing application systems for special-vehicle recognition in driverless cars, cross-modal target detection in remote sensing images, and intelligent information extraction from remote sensing data.
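
First sketch: the abstract states that CMR2Net fuses features through similarity measurement but gives no implementation details, so the following is only a minimal illustrative sketch of similarity-based cross-modal matching in that spirit. The encoder architecture, the feature dimensions (2048 for image features, 1024 for audio features), and the choice of cosine similarity are assumptions, not the thesis's actual design.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalEncoder(nn.Module):
    """Maps one modality's features into a shared embedding space."""
    def __init__(self, in_dim: int, embed_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, embed_dim),
        )

    def forward(self, x):
        # L2-normalize so dot products equal cosine similarity
        return F.normalize(self.net(x), dim=-1)

# Assumed input dimensions for the two modality branches
image_enc = CrossModalEncoder(in_dim=2048)   # e.g. CNN image features
audio_enc = CrossModalEncoder(in_dim=1024)   # e.g. spectrogram features

def retrieval_scores(img_feats, aud_feats):
    """Cosine-similarity matrix; row i ranks all audio samples against image i."""
    return image_enc(img_feats) @ audio_enc(aud_feats).T

# Toy usage: 8 image queries against 100 audio candidates
img = torch.randn(8, 2048)
aud = torch.randn(100, 1024)
scores = retrieval_scores(img, aud)          # shape (8, 100)
top5 = scores.topk(k=5, dim=1).indices       # top-5 retrieved audio per image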
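
Second sketch: the abstract says only that TCMR2Net transfers the structure and low-level parameters of CMR2Net, so this sketch shows one common way to realize that idea: copy the pretrained weights, freeze the low-level layer, and fine-tune the rest on the small-sample target data. The encoder shape, which layers count as "low-level", and the optimizer settings are all assumptions.

import torch
import torch.nn as nn

def make_encoder(in_dim: int, embed_dim: int = 256) -> nn.Sequential:
    # Same two-layer encoder shape as in the previous sketch
    return nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(),
                         nn.Linear(512, embed_dim))

pretrained = make_encoder(2048)   # stands in for a CMR2Net branch trained on SVMD
target = make_encoder(2048)       # stands in for a TCMR2Net branch for RSAMD

# Transfer the structure and parameters, then freeze the low-level layer
target.load_state_dict(pretrained.state_dict())
for p in target[0].parameters():  # target[0] is the first (low-level) Linear
    p.requires_grad = False

# Fine-tune only the remaining (high-level) parameters on the small dataset
optimizer = torch.optim.Adam(
    (p for p in target.parameters() if p.requires_grad), lr=1e-4)

Freezing the transferred low-level layer is one plausible reading of "transfers the low-level parameters"; the thesis may instead fine-tune all layers at different learning rates.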