Research On Entity Relation Extraction Of Geological Disaster Text

Posted on: 2022-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Zhou
GTID: 2480306779996089
Subject: Computer Software and Application of Computer
Abstract/Summary:
Entity relation extraction for geological disaster text aims to automatically extract knowledge triples from large-scale unstructured text. To structure such text, a system must not only identify entity boundaries accurately but also judge the relation between each entity pair in light of the characteristics of the geological domain. This thesis presents a systematic study of structuring geological disaster text and surveys the current state of triple extraction technology and geological big data services in China and abroad. At present, the task is mainly solved with the pipeline method, which first performs entity recognition and then relation classification. The rapid development of Chinese pre-trained models has greatly improved performance on entity relation extraction. However, problems remain when these methods are applied to geological disaster text, including omitted entities, blurred entity boundaries, and error propagation. To address these problems, and building on dependency parsing and deep learning, the main work of this thesis is as follows:

1. To address blurred entity boundaries in geological disaster text, an entity relation extraction method based on the core verb chain is proposed. The method first uses dependency parsing to extract the core verbs of a sentence, then treats each core verb as the relation word of a triple and locates the head entity and tail entity of the triple by following the dependency relations. A module for completing omitted components is also designed, which makes full use of part-of-speech tags and dependencies to capture entity boundaries. On the self-built geological disaster dataset and the COAE2016 dataset, the average F-score is 95.45% for core verb extraction and 82.70% for triple extraction.

2. To address the error propagation that the pipeline method is prone to, a Transformer-CRF method based on multi-feature fusion is proposed for the joint extraction of geological disaster entities and relations. To capture semantic features fully, sparse and dense sentence features are fused into a single feature representation. To improve the joint extraction model, an entity labeling scheme and extraction rules are designed, and the position information of each entity is encoded in its label, which improves the accuracy of entity recognition. A two-layer Transformer encodes the sentence context so that semantic connections are fully exploited. On the self-built geological disaster dataset, the model achieves an F-score of 74.68% for named entity recognition and an F-score of 62.31% for relation extraction.
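The core-verb idea in the first contribution can be illustrated with a minimal sketch. It is not the thesis's implementation: the dependency labels (`HED` for the root verb, `SBV` for subject, `VOB` for object, `ATT` for attributive modifier) are assumed here in the style of common Chinese dependency parsers, the parse is written by hand rather than produced by a parser, and the names `Token`, `phrase`, and `extract_triple` are hypothetical. Expanding each entity over its `ATT` modifiers only gestures at the boundary-capturing step; the completion module described in the abstract is not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the sentence
    word: str
    head: int    # index of the governing token, 0 for the root
    deprel: str  # dependency label (assumed label set: HED, SBV, VOB, ATT)


def children(tokens, head_idx, deprel):
    """Return the tokens governed by head_idx through the given relation."""
    return [t for t in tokens if t.head == head_idx and t.deprel == deprel]


def span_tokens(tokens, token):
    """Collect a token together with its (nested) attributive modifiers."""
    span = [token]
    for mod in children(tokens, token.idx, "ATT"):
        span.extend(span_tokens(tokens, mod))
    return span


def phrase(tokens, token):
    """Render the expanded entity span in sentence order."""
    ordered = sorted(span_tokens(tokens, token), key=lambda t: t.idx)
    return " ".join(t.word for t in ordered)


def extract_triple(tokens):
    """Use the core verb as the relation word and follow SBV/VOB arcs."""
    core = next((t for t in tokens if t.deprel == "HED"), None)
    if core is None:
        return None
    subjects = children(tokens, core.idx, "SBV")
    objects = children(tokens, core.idx, "VOB")
    if not subjects or not objects:
        return None  # a real system would try to complete the omitted component
    return (phrase(tokens, subjects[0]), core.word, phrase(tokens, objects[0]))


# Hand-written toy parse (an English stand-in for a geological disaster sentence).
SENTENCE = [
    Token(1, "heavy", 2, "ATT"),
    Token(2, "rainfall", 3, "SBV"),
    Token(3, "triggered", 0, "HED"),
    Token(4, "large", 5, "ATT"),
    Token(5, "landslide", 3, "VOB"),
]

print(extract_triple(SENTENCE))
```

For the toy parse, the core verb "triggered" becomes the relation word, and the subject and object spans grow to include their modifiers, which is one simple way dependencies can help fix an entity boundary.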
Keywords/Search Tags: geological disaster text, entity relationship extraction, named entity recognition, dependency parsing