Font Size: a A A

A New Image Captioning Algorithm

Posted on:2024-09-29Degree:MasterType:Thesis
Country:ChinaCandidate:Y YaoFull Text:PDF
GTID:2568307067996509Subject:Applied Statistics
Abstract/Summary:PDF Full Text Request
Image Captioning aims to build a model by computer,input the extracted image features into the language generation model,and finally output the corresponding natural language description.It is an interdisciplinary problem in the intersection of Computer Vision,Natural Language Processing and Artificial Intelligence,and is the key task to further study the intelligent image understanding that accords with human perception.The current mainstream image captioning algorithm adopts the encoder-decoder framework,in which the encoder is used to extract the visual information in the image and send it to the back-end decoder to generate the corresponding natural language description.In this paper,based on this framework,a new image captioning algorithm is proposed,which successively introduces the Mask R-CNN model of the object instance segmentation framework and the Transformer model in the field of machine translation as the image encoder and language decoder respectively.After the model is trained to stability through the cross-entropy loss,the self-critical sequence training algorithm is adopted to further optimize the model with the aim of improving the model’s score on the CIDEr index.Based on the Microsoft COCO dataset,the validity of the proposed algorithm is proved by the comparison of different models.
Keywords/Search Tags:Image Captioning, Encoder-Decoder, Object Detection, Reinforcement Learning
PDF Full Text Request
Related items