Research On Named Entity Recognition For Architectural Texts

Posted on: 2024-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Wang
Full Text: PDF
GTID: 2531307091980959
Subject: Information management and information systems
Abstract/Summary:
In recent years, the frequency of production safety accidents in housing and municipal engineering, and the fatality rate of major and more severe production safety accidents, have remained high. To strengthen production safety management in housing and municipal engineering, the relevant accident information needs to be analyzed statistically. At present, this information is still collated manually, and there is little research on information mining for production safety accidents in housing and municipal engineering, so extracting more useful information from the material in this field is of great significance. The main subject of this thesis is named entity recognition for texts in the construction field, where the texts in question are reports of production safety accidents in housing and municipal engineering. Accident analysis is an important part of production safety management, yet the safety knowledge contained in accident reports has a low reuse rate and provides insufficient reference for production safety management.

Based on an analysis of the number of accidents, the number of deaths, the accident types, the year-by-year and month-by-month changes in accidents, and a word cloud of the accident texts for production safety accidents in housing and municipal engineering from 2009 to 2020, this thesis collects the accident reports from 2010 to 2022 as the experimental corpus for the entity recognition study. It also determines the approximate proportions of accidents by year, by month and by accident type among the more than 1,200 accident reports, in order to make the experimental corpus more comprehensive, reasonable and sufficient, and to improve the reusability of the accident analysis results and the adequacy of the reference they provide for production safety management. As a tool for the structured storage and reuse of knowledge, a knowledge graph makes it possible to quickly retrieve cases of production safety accidents in housing and municipal engineering, to analyze and count accident correlation paths, and to raise the level of production safety management. According to the application requirements of a knowledge graph of production safety accidents in housing and municipal engineering, this thesis analyzes the text of the experimental corpus to define seven entity categories, defines the entity labeling method, introduces the entity labeling tools, and finally constructs an independent dataset for named entity recognition on production safety accident texts in housing and municipal engineering.

In the proposed approach, a RoBERTa model pre-trained with whole word masking produces dynamic word vectors, a bidirectional long short-term memory (BiLSTM) network captures contextual semantic information and outputs a label score vector for each word, and a conditional random field (CRF) outputs the optimal label sequence for the entities. Combining the advantages of the three models, the thesis proposes the RoBERTa-wwm-BiLSTM-CRF model for named entity recognition on production safety accident texts in housing and municipal engineering. To train the model and verify its entity recognition performance, four groups of experiments are set up, covering the reasonable size of the dataset, the reasonable split ratio of the dataset, the overall entity recognition effect, and the recognition effect for individual entity categories, and the experimental results are compared and analyzed. The final experiments show that, with the RoBERTa-wwm-BiLSTM-CRF model, overall entity recognition on the self-built dataset of housing and municipal engineering production safety accidents performs best at a data scale of 1,200 and an 8:1:1 split, where the F value reaches 89.669%.
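
The 8:1:1 split and the F value reported above can be illustrated with a minimal Python sketch. It is not the thesis's evaluation code. It assumes that the three parts of the split are training, validation and test sets, that entities are labeled with a BIO scheme, and that the F value is an exact-match, entity-level F1. The tag names `ACC` and `LOC` are hypothetical placeholders, since the abstract does not name the seven entity categories.

```python
import random


def split_811(items, seed=0):
    """Shuffle and split items 8:1:1 (assumed to mean train/validation/test)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_dev = len(items) // 10
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]
    return train, dev, test


def extract_entities(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence."""
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold_seqs, pred_seqs):
    """Exact-match, entity-level precision, recall and F1 over all sentences."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_entities(gold), extract_entities(pred)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    train, dev, test = split_811(range(1200))
    print(len(train), len(dev), len(test))

    # Hypothetical tags: ACC (accident type) and LOC (location).
    gold = [["B-ACC", "I-ACC", "O", "B-LOC"]]
    pred = [["B-ACC", "I-ACC", "O", "O"]]
    print(entity_f1(gold, pred))
```

Under these assumptions, a predicted entity counts as correct only if both its boundaries and its type match the gold annotation, which is the stricter of the common ways to score named entity recognition.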
Keywords/Search Tags: Production safety accidents of housing and municipal engineering, Knowledge graph, Accident text, Named entity recognition, Whole word mask