Research On Named Entity Recognition In Clock Domain Based On Deep Learning

Posted on: 2022-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: H Wang
GTID: 2492306491453074
Subject: Master of Engineering
Abstract/Summary:
Clock systems are now used in national defense, the economy, finance, industry, communications, and other fields, and their growing complexity creates industry-wide problems in scheme design, production and manufacturing, after-sales service, and after-sales question answering. Applying knowledge graph technology to the clock domain can help address these pain points. Building a clock-domain knowledge graph from structured and unstructured text can help clock professionals provide intelligent solutions for different users, meet users' varied needs, reduce costs, and provide basic support for subsequent knowledge discovery. Named entity recognition is a basic and key step in building a knowledge graph, so it is important to study how to improve entity recognition on clock-domain text. To address the current problems in the clock domain, this thesis studies clock-domain text and deep learning network models in depth and makes the following contributions.

(1) To address the lack of labeled datasets in the clock domain, the thesis defines the clock-domain entity categories, designs an auxiliary labeling platform, and builds a high-quality dataset, clock-dataset. A new-word discovery algorithm based on mutual information and left and right adjacent entropy is used to find new words, analyze the professional terminology of the domain, and define its entity categories. Entity labeling policies and specifications appropriate to the task are then selected, and the auxiliary labeling platform is used to build clock-dataset efficiently, laying the foundation for subsequent named entity recognition.

(2) To address entity nesting and the small number of labeled samples, a BERT-LCRF named entity recognition model for the clock domain is proposed. The pre-trained language model BERT extracts features from clock-domain text, and a linear-chain conditional random field (Linear-CRF) then performs sequence labeling. Comparative experiments show that the model learns the feature information of the clock domain well and improves the accuracy of sequence labeling, which in turn improves named entity recognition in the clock domain.

(3) A clock-domain entity recognition system is designed and implemented, with an interface that enterprises can call. The platform provides data preprocessing, entity category definition, auxiliary labeling, model training, testing, and evaluation. It meets clock professionals' needs for data analysis, labeled dataset construction, and entity recognition, and lays a solid foundation for subsequent knowledge graph construction, which demonstrates the practicality and validity of the proposed model.

In summary, the proposed method improves named entity recognition in the clock domain, addresses problems encountered in the field, provides a feasible approach to clock-domain entity recognition, and lays a solid foundation for knowledge graph construction.
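The new-word discovery step in contribution (1) combines two signals: mutual information, which measures how strongly the parts of a candidate string stick together, and left and right adjacent entropy, which measures how freely the candidate combines with its neighbors. The abstract does not give the thesis's exact formulation, so the following is only a minimal sketch under assumptions: candidates are character n-grams, cohesion is the minimum pointwise mutual information over all two-way splits, and the function name `score_candidates` and the toy input are hypothetical.

```python
import math
from collections import Counter, defaultdict


def score_candidates(text, max_len=4):
    """Score character n-grams (length 2..max_len) as new-word candidates.

    Returns {gram: (pmi, left_entropy, right_entropy, count)}.
    This is an illustrative formulation, not the thesis's exact algorithm.
    """
    n = len(text)
    counts = Counter()
    left = defaultdict(Counter)   # characters seen immediately before each gram
    right = defaultdict(Counter)  # characters seen immediately after each gram
    for size in range(1, max_len + 1):
        for i in range(n - size + 1):
            gram = text[i:i + size]
            counts[gram] += 1
            if i > 0:
                left[gram][text[i - 1]] += 1
            if i + size < n:
                right[gram][text[i + size]] += 1

    def prob(gram):
        # Relative frequency among all n-grams of the same length.
        return counts[gram] / (n - len(gram) + 1)

    def entropy(neighbors):
        total = sum(neighbors.values())
        if total == 0:
            return 0.0
        return -sum(c / total * math.log2(c / total) for c in neighbors.values())

    scores = {}
    for gram, count in counts.items():
        if len(gram) < 2:
            continue
        # Cohesion: the weakest split decides how tightly the gram holds together.
        pmi = min(
            math.log2(prob(gram) / (prob(gram[:k]) * prob(gram[k:])))
            for k in range(1, len(gram))
        )
        scores[gram] = (pmi, entropy(left[gram]), entropy(right[gram]), count)
    return scores


# Toy input: "ab" recurs with a different neighbor on each side every time.
scores = score_candidates("xabyabzabw", max_len=2)
print(scores["ab"])
```

A candidate is usually kept as a new word only when its cohesion and both entropies exceed chosen thresholds. Those thresholds depend on the corpus and are not specified in the abstract.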
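In the BERT-LCRF model of contribution (2), the linear-chain CRF layer chooses the tag sequence with the highest total score, which lets it rule out invalid label transitions that per-token classification would allow. The sketch below shows only the standard Viterbi decoding step for such a layer, not the thesis's implementation. The emission scores stand in for BERT outputs and are made-up numbers, and the tag set (`O`, `B-DEV`, `I-DEV`) and the transition penalty are hypothetical.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag index path for a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t (e.g. from an encoder)
    transitions[i][j] : score of moving from tag i to tag j
    """
    num_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for arriving at tag j.
            best_prev = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j] + emit[j])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    best = max(range(num_tags), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


TAGS = ["O", "B-DEV", "I-DEV"]  # hypothetical BIO tags for one entity type
FORBIDDEN = -1e4                # large penalty: "I-" may not follow "O"
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 0.0],        # from B-DEV
    [0.0, 0.0, 0.0],        # from I-DEV
]
# Made-up emission scores for a three-token sentence.
emissions = [
    [2.0, 0.5, 0.1],
    [0.2, 1.0, 1.2],
    [0.3, 0.1, 1.5],
]
greedy = [TAGS[max(range(3), key=lambda j: e[j])] for e in emissions]
decoded = [TAGS[j] for j in viterbi_decode(emissions, transitions)]
print(greedy)   # per-token argmax, ignores transitions
print(decoded)  # CRF decoding, respects transitions
```

With these numbers, per-token argmax produces the invalid sequence `O, I-DEV, I-DEV`, while Viterbi decoding returns `O, B-DEV, I-DEV`. In a trained model the transition scores are learned parameters, not hand-set values, and a start-of-sequence constraint would normally be added as well.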
Keywords/Search Tags: Named entity recognition, Conditional random fields, Self-attention model, BERT pre-trained language model