Font Size: a A A

The Research Of Automatic News Image Insertion Based On Multi-modal Neural Network

Posted on:2023-11-19Degree:MasterType:Thesis
Country:ChinaCandidate:K WangFull Text:PDF
GTID:2568306836469414Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
These pictures in news reports catch people’s eyes,arouse readers’ interest in reading news content,and also intuitively convey the content of news reports.This dissertation explores how to apply artificial intelligence technology to the field of picture news,hoping to help news editors complete the work of matching pictures for news with the help of the power of machines.Firstly,this dissertation studies the sequential insertion of pictures in news reports.This dissertation proposes a neural model based on the multi-modal RNN equipped with an encoding updating mechanism to insert images into proper positions of a news document sequentially.For each image,our model selects the position of which the surrounding sentences are most relevant to the image.Two methods are employed to score the relevance between the image and surrounding sentences: the one employs a layer of convolution network before scoring,and the other employs the pooling operation after the scoring.It is reasonable to believe that inserting an image in news will render the information of the news.This dissertation hence introduces an updating mechanism to update the encoding of the news document after the image is inserted.This dissertation creates the dataset for the problem by extending the Daily Mail corpora and carry out experiments.Experiments show that the proposed model outperforms four baselines including the pointer network according to two evaluation metrics.Then,this dissertation studies the subject of picture insertion when it is unknown whether the picture is related to the news article.In this case,some candidate pictures cannot be inserted into the article.Therefore,this subject first decides whether pictures can be inserted into news articles.After determining that the insertion is feasible,the insertion location is predicted based on the correlation between the picture and the sentences in the text.This dissertation introduces a new method calculating the similarity between pictures and articles,which takes into account the relationship between pictures and the whole article,as well as the relationship between pictures and each sentence in the article.Since both picture filtering task and picture insertion task need to compute the similarities between pictures and sentences in the text,this dissertation puts these two tasks together for multitask learning,expecting to solve multiple tasks at the same time after one training.The experiments show that the combined training learns a better model than the individual training for both tasks.Therefore,the model of multitask learning reduces the cost of learning and improves the accuracy of the model.Experiments on datasets show that the methods presented in this dissertation are feasible.Compared with the baseline models,the model in this dissertation achieves the best results.The results of this study provide a new way to study the problems related to the matching of pictures with long articles,and can also be applicated in the practical scenario of inserting pictures into long articles.
Keywords/Search Tags:Multi-modal Neural Network, Text Update, Image Locating, Image Filtering, Multi-task Learning
PDF Full Text Request
Related items