Research On Hierarchical Multi-View Summarization Towards Image-Text Matching

Posted on:2024-08-01

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Zhan

Full Text:PDF

GTID:2558306920450824

Subject:Computer technology

Abstract/Summary:

With the development of computer science and technology, and especially with the rapid growth in the volume of image and text data, multimodal retrieval and analysis has become one of the most active research topics and a central concern of image-text matching. In recent years, research on image-text matching has focused on how to effectively narrow the semantic gap between the visual and textual modalities and accurately measure the semantic similarity between images and text. Although much related work exists, it has two limitations: 1) a considerable portion of it overlooks, to some extent, the multi-view description of visual information; 2) semantic complexity interferes with the model's retrieval and matching. As a result, it is difficult to match a given image with multiple text representations, that is, to align the two in the feature space. To address these issues, this thesis proposes a hierarchical multi-view image-text matching method (CAMERA++). First, the method uses an adaptive gated self-attention mechanism to capture local visual region features and textual features, adaptively mine contextual information, and control the internal information flow at a more fine-grained feature level. Second, the method hierarchically summarizes the context-enhanced local visual region features from multiple views, aggregating region-level features into image-level features. Third, the method applies diversity regularization hierarchically to reduce the information redundancy among the hierarchical views. In addition, the method takes the distribution of modality information in the feature space into account: it fits the actual distribution of the modality feature space to a theoretical distribution in order to constrain the training process and enhance the model's robustness. Finally, the method unfreezes some parameters of the pre-trained BERT and fine-tunes them to enhance the expressive power of the text-branch features. The proposed method is evaluated through extensive experiments on two large publicly available datasets to verify its validity and effectiveness.
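
The multi-view summarization and diversity regularization steps described in the abstract can be illustrated with a minimal single-level sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names `summarize_views` and `diversity_penalty` are hypothetical, each view is modeled as a query vector that attention-pools the region features, and the penalty is taken to be the sum of squared cosine similarities between distinct views. The thesis's actual hierarchical design, gating, and loss may differ.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def summarize_views(regions, queries):
    """Pool region features into one summary vector per view.

    Assumption: each view is a query vector; its attention weights over
    the regions are a softmax of query-region dot products.
    """
    dim = len(regions[0])
    views = []
    for q in queries:
        weights = softmax([dot(q, r) for r in regions])
        views.append(
            [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]
        )
    return views


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


def diversity_penalty(views):
    """Sum of squared cosine similarities between distinct views.

    Assumption: redundancy is penalized through the off-diagonal entries
    of the Gram matrix of the L2-normalized view vectors.
    """
    unit = [l2_normalize(v) for v in views]
    k = len(unit)
    return sum(
        dot(unit[i], unit[j]) ** 2 for i in range(k) for j in range(k) if i != j
    )


# Two toy region features and two hypothetical view queries.
regions = [[1.0, 0.0], [0.0, 1.0]]
queries = [[4.0, 0.0], [0.0, 4.0]]
views = summarize_views(regions, queries)
penalty = diversity_penalty(views)
```

Under these assumptions the penalty is zero when the views are mutually orthogonal and grows as they become redundant, so adding it to the matching loss would push the views toward describing different aspects of the image.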

Keywords/Search Tags:

Hierarchical Multi-View Summarization, Hierarchical Diversity Regularization, Distribution Fitting, Fine-tuned BERT

Related items

1	Multi-document Summarization Based On HLDA Hierarchical Topic Model
2	Research And Implementation Of Document Summarization Based On Combined Multi-Feature
3	An Integrated Summarization Framework with Hierarchical Content Representation
4	Research On Hierarchical Topic Modeling Method For Multi-Document Summarization
5	Research On Hierarchical Classification Based On Label Distribution
6	Research On Generation Method Of Evolutionary Multi-document Summarization Based On Sub-topic Enhancement
7	Semantic Hierarchical Clustering Based Multi-document Summarization Research
8	Research On Multi-view Clustering Methods Based On Non-negative Matrix Factorization
9	Sentence Extraction For Multi-Document Summarization Based On Topic Model And Semantics
10	Research On A Multi-Hierarchical Data Distribution Model