
Research And Implementation Of A Document-Level Relation Extraction Technology Integrating Heterogeneous Graph And Homogeneous Graph

Posted on: 2023-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: T C Li
Full Text: PDF
GTID: 2568306914456584
Subject: Computer technology
Abstract/Summary:
With the development of the Internet and artificial intelligence, entity relation extraction, which extracts knowledge from text data, has become increasingly mature. Most current relation extraction methods focus on extracting entity-relation triples from sentence-level text. However, most text on the Internet takes the form of paragraphs or whole documents, so a large body of research on document-level relation extraction has appeared in recent years. The goal of document-level relation extraction is to extract the relations between entity pairs in a document, with a focus on the cases that sentence-level extraction cannot handle: relations that span sentences and relations that can only be determined through reasoning. Graph neural networks have been widely used across NLP with good results and are well suited to document-level relation extraction. Representing the entities and sentences in a text as nodes, constructing a graph over them, and then propagating features on that graph establishes connections within the document, which strengthens the model's ability to identify cross-sentence and inferred relations. This thesis uses graph neural networks to carry out the following research on document-level relation extraction:

1) This thesis proposes using evidence sentence information in relation classification. Evidence sentences are the sentences in a document that carry relational information; sentences that carry none are called noise sentences. Most current work on document-level relation extraction ignores evidence sentence information and uses every sentence in the document when judging the relation type. This brings in noise sentences that contain no relational information and interferes with the judgment of the relation type. To address this issue, this thesis proposes a multi-task joint model that identifies evidence sentences and extracts relations based on them. The experimental results show that adding the evidence sentence prediction module reduces the influence of noise sentences on relation classification and effectively improves the performance of the document-level relation extraction model.

2) The input to document-level relation extraction is a whole document, which is much longer than traditional sentence-level data. The performance of traditional text encoding models drops sharply when the input text is long. At present, the mainstream way to handle overly long input is to use graph neural networks. This thesis studies in depth how graph neural networks can help with understanding long text, designs three types of graph propagation modules (a homogeneous graph, a heterogeneous graph, and a network combining both graph types), and carries out comparative experiments on them.

3) This thesis also builds a visual prototype system for entity relation extraction. The system's back end automatically crawls text data from the Internet and stores it in a database. Through this prototype system, models can be trained and managed, and a chosen model can be used to extract the entities and relations in text data and display them on the page.
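The graph-based idea described in the abstract (entities and sentences as nodes, then feature propagation over the graph) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the edge rules (adjacent sentences are linked, and an entity is linked to each sentence that mentions it), the mean-aggregation update, the toy document, and the names `build_document_graph` and `propagate`. The thesis's actual graph design and propagation modules may differ.

```python
from collections import defaultdict


def build_document_graph(num_sentences, entity_mentions):
    """Build an undirected graph with sentence and entity nodes.

    Assumed edge rules (illustrative only):
      - consecutive sentences are connected;
      - an entity is connected to every sentence that mentions it.
    Nodes are tuples: ("sent", index) or ("ent", name).
    """
    adj = defaultdict(set)
    for i in range(num_sentences - 1):
        a, b = ("sent", i), ("sent", i + 1)
        adj[a].add(b)
        adj[b].add(a)
    for name, sentence_ids in entity_mentions.items():
        for i in sentence_ids:
            e, s = ("ent", name), ("sent", i)
            adj[e].add(s)
            adj[s].add(e)
    return adj


def propagate(adj, features):
    """One round of mean aggregation over each node and its neighbours."""
    updated = {}
    for node, vec in features.items():
        # sorted() keeps the summation order deterministic
        group = [vec] + [features[n] for n in sorted(adj[node])]
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated


# Toy document (hypothetical): three sentences, four entities.
# Sentence 0 mentions Alice and Acme, sentence 1 mentions Acme and Paris,
# sentence 2 mentions Paris and France.
mentions = {"Alice": [0], "Acme": [0, 1], "Paris": [1, 2], "France": [2]}
adj = build_document_graph(3, mentions)

# One-dimensional features: only the "Alice" node starts with a signal.
feats = {("sent", i): [0.0] for i in range(3)}
feats.update({("ent", name): [0.0] for name in mentions})
feats[("ent", "Alice")] = [1.0]

h1 = propagate(adj, feats)   # after one round
h4 = h1
for _ in range(3):           # three more rounds, four in total
    h4 = propagate(adj, h4)
```

In this toy graph, "Alice" and "France" never appear in the same sentence. After one round the signal from "Alice" has reached only its neighbouring nodes; after several rounds it reaches "France", which is the kind of cross-sentence connection that graph propagation is meant to provide.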
Keywords/Search Tags: document-level relation extraction, evidence sentence, graph neural network, joint model