Research On Entity Identification And Relationship Extraction For Legal Documents

Posted on: 2022-03-07 | Degree: Master | Type: Thesis
Country: China | Candidate: X X Zhang | Full Text: PDF
GTID: 2506306485459334 | Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, the amount of information contained in text is growing quickly. Structuring and visualizing this information has become a research focus, which has driven the rise of knowledge graphs. Knowledge graphs have been studied extensively in specific domains, and named entity recognition and relation extraction are the most basic tasks in their construction. Given the applicability of domain knowledge graphs and the lack of domain data, this thesis takes the knowledge acquisition module of a legal-domain knowledge graph as its basis and explores methods for person entity recognition and relation extraction in that domain. Civil judgment documents on marriage cases serve as the research data; these texts are long and their person-name entities are unusual. The models are therefore adjusted for both name recognition and relation extraction to improve extraction performance. The main work is as follows.

First, with BiLSTM-CRF as the baseline, the thesis studies how to improve person entity recognition in legal documents. It targets the problems caused by the small amount of data: insufficient training of the network model, inaccurate segmentation, and the inability to capture deeper contextual semantic information. A pre-trained model is added and an attention mechanism is integrated into the BiLSTM algorithm. With proper parameter settings, the result of the BERT-based model is 10.92% higher than that of the baseline model, and integrating the attention mechanism yields a further slight improvement.

Second, for person relation extraction in the legal domain, the thesis predefines 10 fine-grained family relations that often appear in marriage civil judgment documents, so that relation extraction can be treated as a classification problem. The pre-trained BERT model is used to extract sentence vector representations that carry word, sentence-segment, position, and semantic features, and these representations are fed into a BiGRU-Attention network for relation extraction. The classification result reaches 93.49%. The method is then compared, on the data used in this thesis, with BiLSTM-based relation extraction methods and earlier improved methods, and it outperforms them.

Third, for the visual display of person relations, the relation triples extracted from legal documents are stored and visualized with the Neo4j graph database. The extracted data are imported into the graph database in CSV format, and the person relation network is displayed.
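The final step described in the abstract, exporting relation triples to CSV for import into Neo4j, can be sketched roughly as follows. This is only an illustration under stated assumptions: the triples, the column names (`head`, `relation`, `tail`), the file name `relations.csv`, and the `Person` / `RELATED_TO` labels are hypothetical and are not taken from the thesis.

```python
import csv
import io

# Hypothetical example triples (head person, relation, tail person).
# These are placeholders, not data from the thesis.
TRIPLES = [
    ("Person A", "spouse", "Person B"),
    ("Person A", "father", "Person C"),
]


def triples_to_csv(triples):
    """Serialize (head, relation, tail) triples to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["head", "relation", "tail"])  # assumed column names
    writer.writerows(triples)
    return buffer.getvalue()


# One possible Cypher statement for loading such a file into Neo4j.
# The relation label is stored as a property on a generic RELATED_TO
# relationship; this schema is an assumption, not the thesis's design.
CYPHER_IMPORT = """
LOAD CSV WITH HEADERS FROM 'file:///relations.csv' AS row
MERGE (a:Person {name: row.head})
MERGE (b:Person {name: row.tail})
MERGE (a)-[r:RELATED_TO {type: row.relation}]->(b)
"""

csv_text = triples_to_csv(TRIPLES)
```

Writing `csv_text` to `relations.csv` in the Neo4j import directory and then running `CYPHER_IMPORT` would create one node per person and one relationship per triple. `MERGE` is used here instead of `CREATE` so that a person who appears in several triples maps to a single node.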
Keywords/Search Tags:Pre-training model, Entity recognition, Relation extraction, Neo4j