Research On English Named Entity Extraction

Posted on:2017-09-05

Degree:Master

Type:Thesis

Country:China

Candidate:Y Xue

Full Text:PDF

GTID:2415330590988952

Subject:Applied statistics

Abstract/Summary:


Named entity extraction is the foundation of several natural language processing tasks, such as machine translation, automatic question answering, and coreference resolution. English entity extraction has been studied for more than 10 years, but its target has been limited to the Person, Location and Organization categories. It is therefore of great significance to broaden the range of entity categories and to improve the accuracy of entity recognition. In this paper, we first use an entity linking method to show that Wikipedia cannot cover all entities in the real world, so it cannot meet people's needs for searching and understanding knowledge. We then compare the performance of several models: the Hidden Markov Model, the Maximum Entropy Model, the Conditional Random Fields model, a noun phrase recognition model, the Stanford University named entity recognizer, the Microsoft named entity recognizer, and the Typeless entity recognition model. Empirical analysis of the data shows that the Typeless NER model, which removes the entity type labels and is trained with conditional random fields, performs best. To prevent structural similarity between the training data and the test data from influencing the experimental results, we also selected five days of news data from 2014, as well as short-text data from Microsoft, to validate our conclusion. The data used in this paper is at the scale of millions and comes primarily from web documents, which cover a great variety of writing styles and different signals, so the proposed method is robust across different language domains. This paper offers a new approach to English entity extraction, and the proposed model, T-NER, has already been put into practical use.
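The idea of removing the entity label before training can be illustrated with a minimal sketch. It assumes the training data uses BIO tags such as B-PER or I-LOC, which the abstract does not state, and the function names strip_entity_types and extract_spans are hypothetical, chosen only for this illustration. The sketch covers the label preparation and span read-out steps only; it does not train a conditional random fields model and is not the thesis's implementation.

```python
from typing import List, Tuple


def strip_entity_types(tags: List[str]) -> List[str]:
    """Collapse typed BIO tags (e.g. B-PER, I-LOC) into typeless B / I / O tags."""
    typeless = []
    for tag in tags:
        if tag == "O":
            typeless.append("O")
        else:
            # Keep only the boundary prefix ("B" or "I") and drop the type.
            typeless.append(tag.split("-", 1)[0])
    return typeless


def extract_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Read entity spans (start, end_exclusive, text) out of typeless B / I / O tags."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            # A new entity begins; close any entity that is still open.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag == "I":
            # Tolerate an "I" without a preceding "B" by opening a span here.
            if start is None:
                start = i
        else:
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
                start = None
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Hypothetical example sentence with typed labels (assumption, not thesis data).
tokens = ["Barack", "Obama", "visited", "Microsoft", "in", "Redmond", "."]
typed_tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC", "O"]

typeless_tags = strip_entity_types(typed_tags)
spans = extract_spans(tokens, typeless_tags)
print(typeless_tags)
print(spans)
```

With labels prepared this way, a sequence model only has to learn where entities begin and end, not which category they belong to, which is one plausible reading of why a typeless model can reach beyond Person, Location and Organization.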

Keywords/Search Tags:

entity extraction, conditional random fields, Typeless NER, knowledge graph


Related items

1	Research On Technologies Of Knowledge Graph Construction In Cultural Relics
2	Research On The Named Entity Recognition And Base Noun Phrase Identification
3	The Construction And Analysis Of The Knowledge Graph Of Mongolian Historical Figures
4	Research And Application Of Entity Aligning Method For Chinese-Korean Bilingual Knowledge Graph
5	Research On Named Entity Recognition And Knowledge Graph Construction Of Chinese Classical Literature Texts
6	Research On The Construction And Application Of Knowledge Graph In The Field Of Culture And Entertainment
7	Research On Construction And Application Of Knowledge Graph Of Minority Festivals
8	Research And Implementation Of Thangka Character Question Answering System Based On Knowledge Graph
9	Research And Application Of Construction Method Of Historical Cultural Relic Knowledge Map Based On Deep Learning
10	Design And Implement Of Parser Based On Grammar Function And Collocation