Research On Entity Recognition And Entity Relation Extraction Of Metro Design

Posted on:2022-03-02

Degree:Master

Type:Thesis

Country:China

Candidate:Y N Yao

GTID:2492306512976369

Subject:Computer application technology

Abstract/Summary:

With the continuous improvement of public infrastructure, the subway has gradually become the primary choice for daily travel. Subway engineering construction includes planning, design, construction, and trial operation. Among these, design is the key to ensuring construction quality and an important prerequisite for the safety, economy, and usability of the subway. The subway design code is an important document that constrains subway design, and it is the result of many years of accumulated experience and repeated verification research in China. To accelerate informatization and intelligent development in this field, this paper carries out information extraction on metro design code text, mainly entity recognition and entity relation extraction. The specific research contents are as follows:

(1) Corpus construction for the metro design code. At present, research on entity recognition and entity relation extraction in the metro design field is in its infancy, and existing research has not proposed or published an information extraction corpus. This paper analyzes the code text and obtains the entity types and relation types in this field, as well as the sublanguage characteristics of the code text. Group annotation is then used to tag part of the code text. The tagging process follows a semi-manual closed-loop principle of "generating data sets -> training a benchmark model -> analyzing prediction errors -> formulating a data update strategy -> updating data sets", and constructs an information extraction corpus based on the metro design code.

(2) Named entity recognition method based on vocabulary enhancement and a pre-training mechanism. First, a BiLSTM-CRF entity recognition model is trained, which handles the text with the classical sequence labeling approach. The bottom layer of the model encodes characters, but word-level (vocabulary) information usually plays a vital role in determining entity boundaries. To address this, this paper designs a dynamic framework, SW-BiLSTM-CRF, that is compatible with word input, including word boundary and word embedding information, so as to enhance the model with vocabulary information. In addition, with the help of a pre-training mechanism, contextual features learned from large-scale unlabeled corpora are transferred to the model training process. The unsupervised training process includes two stages, open-domain pre-training and in-depth pre-training on 800,000 building-domain code texts, which yield BcBERT; the model is then fine-tuned on the named entity recognition task. Experimental results show that the BcBERT-SW-BiLSTM-CRF model can effectively improve the F1-measure.

(3) Entity relation extraction method based on average pooling and attention enhancement. First, the code text sequence is represented with BcBERT, and the entity information in the text is obtained through average pooling. At the same time, the relative position information of entities is used to enrich the word-based attention. Finally, the multi-relation prediction results for multiple entity pairs are obtained through a specific output structure. In the experiments, a number of control experiments were set up to illustrate the efficiency of the method in terms of prediction results and running time.
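As an illustration of the two ingredients named in item (3), the following minimal Python sketch average-pools the token vectors inside an entity span into a single entity vector and computes each token's signed distance to that span. It is not the thesis implementation. The toy two-dimensional vectors stand in for BcBERT outputs, the half-open span convention is an assumption, and the function names `average_pool` and `relative_positions` and the specific distance scheme are hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Span = Tuple[int, int]  # half-open token range [start, end); an assumed convention


def average_pool(token_vectors: Sequence[Vector], span: Span) -> Vector:
    """Return the element-wise mean of the token vectors inside the span."""
    start, end = span
    if not 0 <= start < end <= len(token_vectors):
        raise ValueError(f"invalid span {span} for {len(token_vectors)} tokens")
    rows = token_vectors[start:end]
    dim = len(rows[0])
    return [sum(row[d] for row in rows) / len(rows) for d in range(dim)]


def relative_positions(num_tokens: int, span: Span) -> List[int]:
    """Signed distance of every token to the span (assumed scheme):
    0 inside the span, negative before it, positive after it."""
    start, end = span
    offsets = []
    for i in range(num_tokens):
        if i < start:
            offsets.append(i - start)
        elif i >= end:
            offsets.append(i - end + 1)
        else:
            offsets.append(0)
    return offsets


# Toy 5-token sentence; each row stands in for one contextual token vector.
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [0.0, 0.0], [2.0, 6.0]]
head_span = (1, 3)  # entity covering tokens 1 and 2
tail_span = (4, 5)  # entity covering token 4

head_vec = average_pool(tokens, head_span)
tail_vec = average_pool(tokens, tail_span)
head_offsets = relative_positions(len(tokens), head_span)
tail_offsets = relative_positions(len(tokens), tail_span)

print(head_vec, tail_vec)
print(head_offsets, tail_offsets)
```

In a full model, the pooled head and tail vectors would typically feed a relation classifier, and the offsets would index a learned position embedding that is combined with the attention scores. How the thesis combines them is not specified in the abstract.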

Keywords/Search Tags:

Design code, Named entity recognition, Entity relation extraction, BiLSTM, BERT

Related items

1	Research On Named Entity Recognition Method Of Rail Transit Code
2	Research On Named Entity Recognition In Clock Domain Based On Deep Learning
3	Research On Military Named Entity Recognition Method Based On Pre-training Language Model
4	Research On Key Technologies Of Named Entity Recognition For Rail Transit Code
5	Research And Implementation Of Named Entity Recognition And Entity Linking For Fault Report In Power Grid
6	Research And Implementation Of Knowledge Extraction Method For High-speed Rail Intelligent Operation And Maintenance
7	Intelligent Text Classification And Entity Recognition Of Inspection Text For Water Transmission Project
8	Research On New Energy Vehicles Named Entity Recognition Based On Multi-feature
9	Named Entity Recognition Of The Code For Geology Investigation Of Railway Engineering
10	Research On Key Technologies Of Bridge Inspection Text Information Extraction Based On Deep Neural Networks