
Image Chinese Caption Generation Based On Attention Mechanism

Posted on: 2024-02-27  Degree: Master  Type: Thesis
Country: China  Candidate: S F Yi  Full Text: PDF
GTID: 2568306926968149  Subject: Engineering
Abstract/Summary:
Image content generation has attracted growing attention in recent years. As deep learning has advanced, attention mechanisms have become a research focus, and a growing number of attention-based image content generation models have appeared. How to use image features efficiently to generate more accurate and more complete content descriptions has become the main research direction in this area. Most image content generation work today targets English, but it is also important to meet the needs of native speakers of other languages. This thesis therefore studies Chinese image content generation. First, ResNet-LSTM is chosen as the basic network structure, and the jieba word segmentation module is used to segment the Chinese input text into a form suitable for machine processing. To improve the model's ability to generate accurate image content, an improved channel attention mechanism is added to the decoder, yielding a Chinese image content generation model based on the improved channel attention mechanism. In experiments on AI Challenge 2017, this model outperforms the original model, the model with the original channel attention mechanism, and the model with the spatial attention mechanism. Second, a multi-attention fusion method is proposed to address the loss of key image information that arises because a single attention mechanism attends to only one level of the image features. The method linearly fuses the improved channel attention mechanism with the spatial attention mechanism and adds the result to the decoder of the ResNet-LSTM model. Again in experiments on AI Challenge 2017, this model significantly outperforms the original model, the model with the original channel attention mechanism, the model with the spatial attention mechanism, and the model with the improved channel attention mechanism. Finally, this thesis builds a Chinese image content generation dataset oriented toward daily life, to address the scarcity of domestic Chinese datasets and the lack of Chinese datasets covering daily life. The dataset is used to train the model with the improved channel attention mechanism and the model with the fused attention mechanism, in order to study the effect of training data volume on training results. The performance of each model on this dataset is compared, and the content generated for the same images after training on this dataset and on AI Challenge 2017 is analyzed.
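
The linear fusion of channel and spatial attention described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's actual formulation, so everything here is an assumption: the feature map is a plain list of channels, each holding one value per spatial location; the attention scores are simple mean activations standing in for learned scoring layers; and the function names and the fusion weight `alpha` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(features):
    """Reweight each channel (assumed scorer: mean activation per channel)."""
    weights = softmax([sum(ch) / len(ch) for ch in features])
    return [[w * v for v in ch] for w, ch in zip(weights, features)]


def spatial_attention(features):
    """Reweight each location (assumed scorer: mean activation across channels)."""
    n_ch = len(features)
    n_loc = len(features[0])
    weights = softmax(
        [sum(ch[i] for ch in features) / n_ch for i in range(n_loc)]
    )
    return [[w * v for w, v in zip(weights, ch)] for ch in features]


def fuse(features, alpha=0.5):
    """Linear fusion: alpha * channel-attended + (1 - alpha) * spatial-attended."""
    ca = channel_attention(features)
    sa = spatial_attention(features)
    return [
        [alpha * c + (1 - alpha) * s for c, s in zip(c_row, s_row)]
        for c_row, s_row in zip(ca, sa)
    ]


# Toy feature map: 2 channels, 3 spatial locations.
features = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
fused = fuse(features, alpha=0.5)
```

In a real ResNet-LSTM decoder the scores would typically also depend on the LSTM hidden state at each decoding step, and `alpha` could be fixed or learned; the sketch only shows the shape of the linear combination.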
Keywords/Search Tags:Attention mechanism, Multi-attention mechanism fusion, Image Chinese content generation, Deep learning