Research On Element Identification For Legal Documents

Posted on: 2021-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: D X Wang
GTID: 2416330626455424
Subject: Computer technology
Abstract/Summary:
In recent years, with the continuous development of natural language processing technology and the ongoing disclosure of judicial big data, represented by judgment documents, applying artificial intelligence to the judicial field to improve the efficiency of case processing has gradually become a research hotspot in legal intelligence. Legal documents contain abundant information about case elements. Extracting these elements helps judges obtain the information they need more quickly and conveniently, and improves their efficiency in handling cases. This paper focuses on methods for identifying the elements of legal documents. The main research work is as follows:

(1) Basic element identification for legal documents. The basic elements of a legal document are the common basic case information, such as the case number, evidence name, and confirmation content, which can be extracted directly from the document. This paper proposes JCWA-DLSTM to identify the basic elements, based on two characteristics of the evidence name and the confirmation content: both are long in characters, and the two are strongly correlated. The method uses a pre-trained word-level language model to obtain word representations that contain character context, which reduces the impact of word segmentation errors. At the same time, a self-attention mechanism captures the dependencies between words and establishes the relevance between basic elements, so that the basic elements of legal documents can be recognized. The method is compared experimentally with baseline methods. The results show that JCWA-DLSTM reaches an F1 value of 91.70%, significantly better than the baselines, indicating that the method is helpful for identifying the basic elements of legal documents.

(2) Core element identification for legal documents. A core element is an important fact description in a legal document, which must be classified into a preset fact-description element category according to the semantics of the text. Observation shows that element labels are both correlated with and different from one another. To make full use of label information, this paper proposes HIAN, a method for identifying the core elements of legal documents. The method uses hierarchical attention to capture label features and obtain label-specific representations, which are then used to identify core elements. The method is compared with baseline methods on datasets in three domains. The results show that the Macro-F1 value of HIAN is significantly higher than that of the baselines, indicating that HIAN captures richer label features and is effective for identifying the core elements of legal documents.

(3) Design and implementation of a legal document element identification system. To help legal workers extract the elements of legal documents automatically, this paper uses the JCWA-DLSTM and HIAN methods to design and implement a legal document element identification system. The system has a simple interface, is convenient to use, and can complete the identification of legal document elements.
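
The Macro-F1 value reported for core element identification averages the per-label F1 scores, so rare element labels count as much as frequent ones. The following is a minimal sketch of that metric, not the thesis's evaluation code. It assumes that each sentence may carry zero or more element labels; the function name `macro_f1`, the label names, and the toy data are all hypothetical.

```python
from typing import List, Set


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: List[str]) -> float:
    """Average the per-label F1 scores, weighting every label equally."""
    scores = []
    for label in labels:
        # Count true positives, false positives, and false negatives for this label.
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label that is never gold and never predicted scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Hypothetical element labels and toy data, for illustration only.
labels = ["A", "B", "C"]
gold = [{"A"}, {"A", "B"}, {"C"}, set()]
pred = [{"A"}, {"B"}, {"C", "A"}, set()]
score = macro_f1(gold, pred, labels)
print(round(score, 4))
```

In this toy data, label A has one true positive, one false positive, and one false negative (F1 of 0.5), while B and C are predicted perfectly, so the macro average is 2.5 / 3.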
Keywords/Search Tags:Legal Documents, Elements Identification, Self-Attention, Bidirectional LSTM, Hierarchical Interactive Attention