Research On Pictorial Summarization Of Events

Posted on: 2017-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
GTID: 2348330488483517
Subject: Software engineering

Abstract/Summary:
Social media is currently one of the most popular applications on the Internet and an important platform with a huge volume of users and content. When a hot event occurs, it attracts the attention of social media users, who produce a large amount of related content (such as microblog posts and comments) in the form of text, images, video and so on. Because social media is interactive and open, the data it generates is usually noisy and redundant, which makes it difficult to extract the main aspects of an event and the views people hold on it. In this thesis, we develop an automated method for generating a pictorial summary of an event, so that users can obtain information about a social event easily, quickly and accurately.

Based on a review and analysis of existing related work, we propose a definition of an event in social media and a formal description of the problem, drawing on concepts and research from text information processing, multimedia content analysis and social media content analysis.

Since images and texts are heterogeneous media, we propose an image clustering method based on kernel canonical correlation analysis (KCCA) to obtain a better match between them. The method combines the visual features CM, GLCM and HOG with LDA text modeling. We first establish the correspondence between images and texts on the Div400 image retrieval dataset and then cluster the images in our own data. We also propose several effective preprocessing methods for multimedia data in the experiments, and obtain clustering accuracy rates of 80.60% on our own data and 72.82% on Div400.

In Chapter 4, we present criteria and quantitative indicators for representative images: Representative, Context and Coverage. We extract keywords using an improved TextRank algorithm and match images with text by combining a visibility model with the traditional tf-idf model.

Finally, we make a tentative attempt to analyze the structure of an event from different aspects and from the views of different users. The last chapter summarizes the thesis and identifies several possible questions for subsequent research.
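The abstract does not say how the improved TextRank algorithm differs from the original. For reference only, here is a minimal sketch of baseline TextRank keyword ranking: words that co-occur within a small window are linked in an undirected graph and then scored with a PageRank-style iteration. The function name `textrank_keywords`, the tiny `STOPWORDS` set, the window size and the damping factor of 0.85 are all illustrative assumptions, not details taken from the thesis.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "the", "of", "and", "in", "on", "to", "is", "for"}


def textrank_keywords(text, window=2, damping=0.85, iterations=50, top_k=5):
    """Rank words with baseline TextRank (not the thesis's improved variant)."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

    # Build an undirected co-occurrence graph: link each word to the
    # distinct words that follow it within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != word:
                neighbors[word].add(words[j])
                neighbors[words[j]].add(word)

    # PageRank-style iteration: each word shares its score equally
    # among its neighbours.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


if __name__ == "__main__":
    post = "flood rescue flood victims flood relief flood damage"
    print(textrank_keywords(post, top_k=3))
```

In this sketch a word that co-occurs with many different words, such as "flood" in the sample post, accumulates the highest score. The matching step that combines the visibility model with tf-idf is not reproduced here, because the abstract does not specify it.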
Keywords/Search Tags: social media events, Kernel Canonical Correlation Analysis, clustering algorithm, summarization generation