Research Of Ontology-Based Information Extraction

Posted on:2008-09-13

Degree:Master

Type:Thesis

Country:China

Candidate:J Chen

Full Text:PDF

GTID:2178360218951203

Subject:Computer application technology

Abstract/Summary:

Research on Information Extraction aims to provide more powerful information access tools that help people cope with information overload. As the shared knowledge of a specific domain, an ontology can address a challenging problem in information extraction: the knowledge engineering bottleneck.

This thesis uses a top-down approach and a three-level ontology framework to build its ontology, the Professor's CV Ontology (PCV), whose concepts are divided into event concepts and extended concepts. Concept instances are obtained by computing WordNet-based semantic similarity and by manual collection, yielding a complete ontology of concepts, relations, and instances.

The thesis proposes an approach to Information Extraction based on the ontology and on classification. It uses the concepts, relations, and instances in the ontology, and the content to be extracted is determined by the ontology's elements. We first bring specific concepts and their instances from the ontology into document preprocessing and mark the occurrences of those instances in the documents. We then extract information hierarchically. Based on the features of the document being processed, its sentences are classified and their event types determined in advance; the set of sentence event types is derived from the event concepts in the ontology. From each event type we obtain the corresponding extended concepts and properties, which are used to build extraction templates. Instances are then extracted directly by combining the labeling results with the templates.

The content to be extracted is identified by the concepts and relations in the ontology, which ensures consistency of both structure and data. In addition, converting extraction into classification greatly reduces the manual effort of labeling training data. Experiments show that the new approach performs well on Information Extraction.
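
The hierarchical extraction described in the abstract (classify a sentence into an event type, look up the template for that event concept, then fill the template) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two event concepts, their slots, the trigger words, and the regular expressions are hypothetical and are not taken from the PCV ontology. The keyword-overlap classifier is only a stand-in for a learned rule classifier such as RIPPER, and the regular expressions stand in for instance annotation of the kind GATE/ANNIE would provide.

```python
import re

# Hypothetical miniature ontology (NOT the thesis's PCV ontology).
# Each event concept has trigger words, standing in for learned
# classification rules, and slots, standing in for the extended
# concepts that make up its extraction template.
ONTOLOGY = {
    "Education": {
        "triggers": {"received", "graduated", "degree"},
        "slots": {
            "degree": r"\b(?:Ph\.D\.|M\.S\.|B\.S\.)",
            "year": r"\b(?:19|20)\d{2}\b",
        },
    },
    "Employment": {
        "triggers": {"joined", "worked", "appointed"},
        "slots": {
            "position": r"\b(?:Professor|Lecturer|Researcher)\b",
            "year": r"\b(?:19|20)\d{2}\b",
        },
    },
}


def classify_sentence(sentence):
    """Return the event concept whose triggers overlap the sentence most,
    or None when no trigger word occurs at all."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    best_event, best_score = None, 0
    for event, spec in ONTOLOGY.items():
        score = len(tokens & spec["triggers"])
        if score > best_score:
            best_event, best_score = event, score
    return best_event


def extract(sentence):
    """Classify the sentence, then fill the template of its event concept."""
    event = classify_sentence(sentence)
    if event is None:
        return None
    record = {"event": event}
    for slot, pattern in ONTOLOGY[event]["slots"].items():
        match = re.search(pattern, sentence)
        if match:
            record[slot] = match.group(0)
    return record


if __name__ == "__main__":
    for s in [
        "She received her Ph.D. from Example University in 1998.",
        "He joined the department as Professor in 2003.",
        "He enjoys hiking.",
    ]:
        print(extract(s))
```

Because the template is chosen by the event type, only the slots defined for that event concept can appear in a record, which is one way to read the abstract's point that the ontology keeps the structure of the extracted data consistent.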

Keywords/Search Tags:

Information Extraction, Ontology, Sentence Categorization, RIPPER, GATE, ANNIE, Semantic Similarity

Related items

1	Research On The Calculation Method For Semantic Similarity Of Sentence And Its Application
2	Research On Automatic Question Answering System Based On Ontology
3	Research On The Search Technology Of Geographical Information Based On Semantic Similarity
4	Research And Realization On Text Similarity Based On Ontology
5	Research On Semantic Annotation Approach Based On Complex Sentence Ontology
6	Research On Ontology-Based Semantic Text Categorization
7	Research On Semantic Similarity Matching Algorithm Of Questions Based On Deep Learning
8	Research On The Method Of Text Categorization Based On Semantic Similarity
9	Study On HowNet Ontology Based Text Categorization Algorithm And Its Application
10	Research Into The Method Of Information Extraction Based-on Sentence Clustering