
Research On Named Entity Recognition For Hazardous Chemical Storage Technology

Posted on: 2024-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Wang
GTID: 2531307139477334
Subject: Materials and Chemical Engineering (Professional Degree)
Abstract/Summary:
Effective and safe storage of hazardous chemicals has always been an important concern of the chemical industry. How can chemical companies be helped to choose appropriate storage technologies for hazardous chemicals, and how can the pressure of supervision and accident handling in safe storage be eased? One approach is to build a knowledge graph for the hazardous chemicals domain that assists staff in selecting storage technologies. Named entity recognition (NER) is an important prerequisite for building such a knowledge graph: it identifies the risk and hazard information of hazardous chemicals in text, which in turn helps practitioners select appropriate storage technologies. However, existing research on NER in the hazardous chemicals field faces several problems: the lack of public domain-specific corpora, the wide variety of hazardous chemical entities, blurred entity boundaries, and the polysemy of entity words. In response, this thesis proposes two models built on different pre-trained language models to identify key information in hazardous chemical texts. The main contributions of this thesis are:

1. Based on the needs of the hazardous chemicals business field, a data set covering 2800 hazardous chemicals was constructed and annotated with five types of entities: hazardous chemicals, risks, risk generation conditions, post-risk products, and dangers. This addresses the lack of data sets in the hazardous chemicals field and lays the foundation for subsequent research.

2. This thesis proposes the RoBERTa_wwm_ext-BiLSTM-CRF neural network model, applying the RoBERTa_wwm_ext pre-trained model to named entity recognition of hazardous chemicals for the first time, to address polysemy and blurred entity boundaries in hazardous chemical texts. In comparative experiments, the model achieves good results.

3. The RoBERTa_wwm_ext-BiLSTM-CRF model recognizes some entity types poorly, which may lead to inaccurate information recommendations and affect the selection of hazardous chemical storage technologies. Therefore, this thesis proposes using the out-of-order (permuted) language model PERT to obtain semantic information from text whose word order has been permuted, and then combines it with BiGRU, a bidirectional recurrent neural network, and a CRF layer. Experimental results show that the proposed model reaches an F1 score of 94.18 and effectively resolves the poor recognition of some entity types.
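As an illustration of how an F1 score for named entity recognition is commonly computed, the following is a minimal sketch of entity-level F1 over BIO-tagged sequences. It assumes a BIO tagging scheme and exact-match span scoring; the abstract does not state which tagging scheme or evaluation protocol the thesis uses, and the labels `CHEM` and `RISK` are hypothetical stand-ins for the annotated entity types.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); labels are hypothetical
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((label, start, i))
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        else:
            # "O" or a stray "I-" tag without a matching "B-": no open span.
            start, label = None, None
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match entity-level F1 (range 0..1) over a list of sentences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with the sentence index so spans stay distinct.
        gold_spans |= {(idx,) + s for s in extract_spans(g)}
        pred_spans |= {(idx,) + s for s in extract_spans(p)}
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [["B-CHEM", "I-CHEM", "O", "B-RISK"]]
pred = [["B-CHEM", "I-CHEM", "O", "O"]]
print(round(entity_f1(gold, pred), 4))
```

In this toy case the prediction finds one of the two gold entities and predicts no spurious ones, so precision is 1.0 and recall is 0.5. A score reported as 94.18 is presumably the same quantity expressed as a percentage.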
Keywords/Search Tags: Named entity recognition, Hazardous chemicals, RoBERTa_wwm_ext, Pre-training model, PERT