Knowledge Extraction And Reuse In Wikipedia

Posted on:2010-04-26

Degree:Master

Type:Thesis

Country:China

Candidate:H J Zhang

Full Text:PDF

GTID:2178360275470239

Subject:Computer application technology

Abstract/Summary:


With the emergence of Web 2.0, collaborative authoring systems embrace the power of collective intelligence and have been widely adopted for knowledge management. The wiki is a popular example of such systems. One of the best-known wikis is Wikipedia, the largest free online encyclopedia, authored by a broad community of volunteers. Wikipedia also qualifies as a potential semantic data source because of its broad knowledge coverage, well-defined information structure, and continuous evolution as world knowledge changes. Semantic wikis aim to enhance wikis with Semantic Web technologies by adding explicit semantics to wiki entities.

While the freedom of collaborative editing contributes to the success of Wikipedia, it also creates problems. In particular, it results in a large number of missing and noisy annotations, which degrade the quality of the content and impede terminology convergence. Currently, low-quality annotations have to be addressed by a small group of experts, which becomes a bottleneck. These experts are also the most active contributors, who make most of the edits, so the burden on them is heavy. Semantic wikis face a similar problem: a lack of annotated semantics and of semantic annotators. Specifically, to edit high-quality articles that have meaningful relationships with the rest of the collection, casual users are required to know a great deal about the collection and to understand the underlying semantic technologies. They need to know:

1) When is it necessary to provide a hyperlink to a target entity of a related topic for reference, and how can the target be located?
2) What categories are proper to characterize an article?
3) What infobox can be used to model the properties of an article?
4) Is there any implicit relationship between entities when editing Semantic Wikipedia? If so, how should it be annotated?

In this thesis, we try to help users answer these questions via knowledge extraction and knowledge reuse. Knowledge extraction is the preliminary step: knowledge reuse is performed on the basis of the extracted background knowledge. We are inspired by collaborative filtering research, which uses the ratings of other like-minded users to calculate recommendations for the active user. Similarly, we reuse collective knowledge by suggesting annotations during Wikipedia authoring.

To accomplish this goal, we first extract meaningful knowledge from the data currently annotated in Wikipedia as our background knowledge. This knowledge consists of structured and semi-structured semantic features of Wikipedia entities, including the entity thesaurus, entity types, and semantic relationships between entities. We then propose a unified annotation suggestion framework that exploits the extracted knowledge, and we apply this knowledge reuse solution to Wikipedia authoring.

We present a prototype system named EachWiki that provides the following annotation suggestion services for users: link suggestion, category suggestion, infobox suggestion, and relation suggestion. In this way, collective intelligence is leveraged. These suggestion services not only help users create high-quality Wikipedia knowledge, but also help build Semantic Wikipedia. Finally, experimental evaluations of each suggestion module demonstrate the effectiveness, efficiency, and usability of our approaches.
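
The collaborative-filtering idea behind annotation suggestion can be illustrated with a minimal Python sketch of category suggestion. This is not the EachWiki algorithm, which the abstract does not specify. It assumes that each article is described by the set of entities it links to, that similarity between articles is measured with Jaccard overlap, and that categories are scored by the summed similarity of the neighbouring articles that carry them. The names suggest_categories, jaccard and CORPUS, and the toy data, are hypothetical.

```python
from collections import defaultdict

# Toy background knowledge (hypothetical): title -> (outgoing links, categories).
CORPUS = {
    "Berlin": ({"Germany", "Europe", "Capital city"},
               {"Capitals in Europe", "Cities in Germany"}),
    "Paris": ({"France", "Europe", "Capital city"},
              {"Capitals in Europe", "Cities in France"}),
    "Python": ({"Programming language", "Guido van Rossum"},
               {"Programming languages"}),
}


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def suggest_categories(target_links, existing, corpus, k=2, top_n=3):
    """Suggest categories for a draft article from its k most similar articles.

    Assumption: similarity is Jaccard overlap of link sets, and a category's
    score is the summed similarity of the neighbours annotated with it.
    """
    # Rank already-annotated articles by similarity to the draft.
    neighbours = sorted(
        ((jaccard(target_links, links), title)
         for title, (links, _) in corpus.items()),
        reverse=True,
    )[:k]

    # Let each neighbour "vote" for its categories, weighted by similarity.
    scores = defaultdict(float)
    for sim, title in neighbours:
        if sim == 0.0:
            continue  # unrelated articles contribute nothing
        for category in corpus[title][1]:
            if category not in existing:
                scores[category] += sim

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [category for category, _ in ranked[:top_n]]


# A draft article about Rome that so far only contains three links.
draft_links = {"Italy", "Europe", "Capital city"}
suggestions = suggest_categories(draft_links, set(), CORPUS)
```

In this toy run the draft shares two links with both "Berlin" and "Paris", so the category they have in common is ranked first, and nothing is suggested from the unrelated "Python" article. The same neighbour-voting pattern could in principle be applied to links, infoboxes, or relations by changing what is voted on.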

Keywords/Search Tags:

Wikipedia, Knowledge Extraction, Knowledge Reuse, Annotation Suggestion, Relation Suggestion


Related items

1	Classification Model And Text Enhancement For Suggestion Recognition
2	Research On Relation Extraction And Its Application In Knowledge Graph Construction
3	Research On Key Technologies Of Suggestion Mining And Generation
4	Internet-oriented Music Suggestion Based On Associated Rules Mining
5	Research On Knowledge Extraction And Management Of The Big Data In Wikipedia
6	The Design And Implementation Of An N-ary Relation Extraction System Based On Knowledge Graphs
7	Research On Key Technologies Of Knowledge Graph Construction For The Knowledge Field Of Ship
8	Knowledge Base Construction Based On Chinese Wikipedia
9	Mining Semantic Knowledge From Chinese Wikipedia
10	Relation Extraction For Industry Knowledge Graph Algorithm Research And Application