
Named Entity Recognition For Judicial Document Data

Posted on: 2024-03-07  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Guo  Full Text: PDF
GTID: 2556307115957579  Subject: Computer technology
Abstract/Summary:
In recent years, information extraction for smart justice ("wisdom justice") services has attracted growing attention from academia and industry. To promote the extraction of named entities from judicial documents, the "Chinese AI & Law Challenge (CAIL)" took the lead in launching a judicial information extraction task in 2021. As an important subtask of information extraction, judicial named entity recognition (JNER) aims to recognize the key entities of a case in judicial documents, thereby supporting downstream tasks such as judicial judgment prediction, judicial knowledge graph construction, and similar case retrieval. To express the details of a case more clearly, legal documents often use nested named entities to describe legal facts. However, existing named entity recognition (NER) methods usually rely on traditional sequence labeling models, which have difficulty recognizing the nested named entities in legal texts, and this greatly reduces the effectiveness of JNER. To address this, this paper introduces the machine reading comprehension (MRC) framework into NER for the judicial domain and proposes an MRC-based judicial nested named entity recognition method that effectively addresses the problem of nested entities in legal texts. Furthermore, because this method cannot effectively learn the entity information contained in the question set and learns inefficiently, this paper also explores an NER method that fuses legal fact descriptions with the question set. The main work of the paper is as follows.

First, we propose a judicial nested named entity recognition method based on the MRC framework. We begin by designing question templates according to the characteristics of judicial nested named entities and constructing a legal text named entity dataset in MRC format. Next, we introduce a span extraction MRC model based on a pre-trained language model to encode the question together with the text and to learn the contextual knowledge about the entity that the question carries. Finally, two classifiers separately predict the start and end positions of the matching spans, from which the corresponding entities are obtained. Experimental results on the information extraction dataset of CAIL 2021 show that, compared with existing baseline models, the proposed method improves the recognition of the nested entities that are common in the judicial domain.

Second, we explore a named entity recognition method that fuses legal fact descriptions with question sets. The legal fact descriptions and the entity-based question sets are first encoded independently. The two resulting representations are then fed into an attention-based semantic fusion module, which explicitly fuses the entity information in the question set into the representation of the legal fact description. Finally, the entities are extracted by two classifiers. The experimental results show that this method improves the model's prediction performance.

Third, we design and implement a judicial named entity recognition system. Building on the methods of the previous chapters, a complete NER system for the judicial domain is developed.
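The MRC formulation described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the entity types, question wording, example sentence, and the helper names `build_mrc_examples`, `spans_to_flags`, and `decode_spans` are all hypothetical, and the rule of pairing each predicted start with the nearest predicted end is an assumed heuristic. The start and end flags here are derived from gold annotations as a stand-in for the outputs of the two classifiers.

```python
from typing import Dict, List, Tuple

# (entity type, start token index, end token index), both indices inclusive.
Entity = Tuple[str, int, int]

# Hypothetical question templates, one per entity type (illustrative wording).
QUESTION_TEMPLATES: Dict[str, str] = {
    "suspect": "Which criminal suspects are mentioned in the text?",
    "victim": "Which victims are mentioned in the text?",
    "stolen_item": "Which stolen items are mentioned in the text?",
}


def build_mrc_examples(tokens: List[str], entities: List[Entity]) -> List[dict]:
    """Turn one annotated sentence into one (question, context, spans) example per type."""
    examples = []
    for etype, question in QUESTION_TEMPLATES.items():
        spans = sorted((s, e) for t, s, e in entities if t == etype)
        examples.append(
            {"type": etype, "question": question, "context": tokens, "spans": spans}
        )
    return examples


def spans_to_flags(spans: List[Tuple[int, int]], length: int) -> Tuple[List[int], List[int]]:
    """Build 0/1 start and end label sequences, the targets of the two classifiers."""
    start, end = [0] * length, [0] * length
    for s, e in spans:
        start[s] = 1
        end[e] = 1
    return start, end


def decode_spans(start_flags: List[int], end_flags: List[int], max_len: int = 10) -> List[Tuple[int, int]]:
    """Assumed heuristic: pair each start with the nearest end within max_len tokens."""
    spans = []
    for i, is_start in enumerate(start_flags):
        if not is_start:
            continue
        for j in range(i, min(i + max_len, len(end_flags))):
            if end_flags[j]:
                spans.append((i, j))
                break
    return spans


# Nested case: the victim "Li" lies inside the stolen item "Li 's phone".
tokens = ["Wang", "stole", "Li", "'s", "phone"]
gold = [("suspect", 0, 0), ("victim", 2, 2), ("stolen_item", 2, 4)]

examples = build_mrc_examples(tokens, gold)
recovered = {}
for ex in examples:
    # Gold-derived flags stand in for the two classifiers' predictions.
    start_flags, end_flags = spans_to_flags(ex["spans"], len(tokens))
    recovered[ex["type"]] = decode_spans(start_flags, end_flags)

print(recovered)
# {'suspect': [(0, 0)], 'victim': [(2, 2)], 'stolen_item': [(2, 4)]}
```

Because each entity type is queried with its own question, the overlapping spans (2, 2) and (2, 4) are predicted in separate passes, which is why nesting across types poses no conflict here, unlike a sequence labeling scheme that assigns a single tag per token.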
Keywords/Search Tags: Named entity recognition, Nested named entity, Wisdom justice, Machine reading comprehension, Attention