
Research Of Named Entity Recognition Based On Hybrid Cascade Model

Posted on: 2017-01-06; Degree: Master; Type: Thesis
Country: China; Candidate: D Y Jia
GTID: 2348330542987006; Subject: Computer software and theory
Abstract/Summary:
With the spread of the Internet, cloud computing, mobile media, the Internet of Things, and other emerging networks, a series of Internet-driven applications such as search engines, e-commerce, and social networking sites have developed rapidly, leading us into the era of big data. Not all information in such massive data is useful, so there is an urgent need for automated tools that help process and identify the valuable data. Named entity recognition techniques were proposed in this environment. They play an important role in information extraction, information retrieval, machine translation, question answering systems, and various other applications of natural language processing.

This thesis analyzes and studies person name, place name (toponym), and organization name entity recognition based on a large amount of real-world paper data. Based on this analysis, the thesis proposes an adaptive statistical recognition model that combines the characteristics of the hidden Markov model and the conditional random field model. The model first draws a sample from the test corpus. It then tests the sampled corpus with both statistical models, the hidden Markov model and the conditional random field model. By analyzing the test results, it adaptively selects the more effective statistical model for identifying a given type of entity under the current training dataset and environment. It then applies the selected model to the whole test corpus to recognize named entities.

Next, the thesis proposes an improved entity recognition model that combines rules with statistical methods. In addition to the adaptive statistical model, this model incorporates rule-based recognition, and it is mainly designed to correct errors that occur frequently in the statistical model. The thesis then proposes a hybrid cascade model for named entity recognition. This model has three layers. From bottom to top, they are person name entity recognition, place name entity recognition, and organization name entity recognition. Each layer applies the improved entity recognition model that combines rules and statistical methods. The recognition results of each layer are added to the rule base, which is then used by the higher layers. The cascade model not only combines the advantages of different statistical models, but also exploits the fact that entities are nested within each other. According to the thesis, this greatly improves the accuracy of place name and organization name recognition.

Finally, experiments comparing the hybrid cascade recognition model proposed in this thesis with existing entity recognition models demonstrate a significant improvement in precision and recall. Moreover, applying the hybrid cascade model shows that it has both research significance and practical value.
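The adaptive selection and the three-layer cascade described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the function names (`choose_tagger`, `run_cascade`), the use of F1 on a labelled sample as the selection criterion, and the toy rule-like taggers standing in for the hidden Markov model and conditional random field model are all hypothetical.

```python
import random
from typing import Callable, Dict, List, Set, Tuple

Sentence = List[str]
Known = Dict[str, Set[str]]          # entity type -> strings found by lower layers
# A tagger returns the token indices it labels as entities of its own type.
Tagger = Callable[[Sentence, Known], Set[int]]


def f1(pred: Set[int], gold: Set[int]) -> float:
    """Token-level F1 for one sentence (1.0 when both sets are empty)."""
    if not pred and not gold:
        return 1.0
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    p, r = tp / len(pred), tp / len(gold)
    return 2 * p * r / (p + r)


def choose_tagger(candidates: Dict[str, Tagger],
                  sample: List[Tuple[Sentence, Set[int]]],
                  known: Known) -> str:
    """Pick the candidate with the highest mean F1 on a labelled sample."""
    def mean_f1(name: str) -> float:
        tagger = candidates[name]
        return sum(f1(tagger(s, known), g) for s, g in sample) / len(sample)
    return max(candidates, key=mean_f1)


def run_cascade(layers: List[Tuple[str, Dict[str, Tagger]]],
                corpus: List[Sentence],
                gold: Dict[str, List[Set[int]]],
                sample_size: int,
                seed: int = 0):
    """Run the layers bottom-up; each layer's output feeds the rule base."""
    rng = random.Random(seed)
    known: Known = {}
    chosen: Dict[str, str] = {}
    results: Dict[str, List[Set[int]]] = {}
    for entity_type, candidates in layers:
        # Assumption: the sampled sentences have gold labels for scoring.
        idx = rng.sample(range(len(corpus)), min(sample_size, len(corpus)))
        sample = [(corpus[i], gold[entity_type][i]) for i in idx]
        chosen[entity_type] = choose_tagger(candidates, sample, known)
        tagger = candidates[chosen[entity_type]]
        results[entity_type] = [tagger(s, known) for s in corpus]
        # Add this layer's entities to the rule base for higher layers.
        known[entity_type] = {s[i] for s, hits in zip(corpus, results[entity_type])
                              for i in hits}
    return chosen, results


# Toy stand-ins for the two statistical models of each layer (hypothetical).
def first_token(s, known):
    return {0} if s and s[0][:1].isupper() else set()

def any_capitalised(s, known):
    return {i for i, t in enumerate(s) if t[:1].isupper()}

def gazetteer(s, known):
    return {i for i, t in enumerate(s) if t == "Paris"}

def capitalised_non_person(s, known):
    return {i for i, t in enumerate(s)
            if t[:1].isupper() and t not in known.get("PER", set())}

def loc_plus_suffix(s, known):
    # Nested entity: a known place name followed by an organization suffix.
    hits: Set[int] = set()
    for i in range(1, len(s)):
        if s[i] == "University" and s[i - 1] in known.get("LOC", set()):
            hits |= {i - 1, i}
    return hits

def suffix_only(s, known):
    return {i for i, t in enumerate(s) if t == "University"}


CORPUS = [["Alice", "visited", "Paris"], ["Bob", "joined", "Paris", "University"]]
GOLD = {"PER": [{0}, {0}], "LOC": [{2}, {2}], "ORG": [set(), {2, 3}]}
LAYERS = [
    ("PER", {"first_token": first_token, "any_capitalised": any_capitalised}),
    ("LOC", {"gazetteer": gazetteer, "capitalised_non_person": capitalised_non_person}),
    ("ORG", {"loc_plus_suffix": loc_plus_suffix, "suffix_only": suffix_only}),
]
```

Calling `run_cascade(LAYERS, CORPUS, GOLD, sample_size=2)` returns the tagger chosen for each layer and the per-sentence predictions. In this toy setup the organization layer can only recognize "Paris University" as a whole because the place name layer has already added "Paris" to the rule base, which is the nesting property the abstract refers to.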
Keywords/Search Tags: Named entity recognition, Natural language processing, Hidden Markov model, Hybrid cascade model