
Research And Realization Of Identification Method Of Legal Document Elements

Posted on: 2021-02-03  Degree: Master  Type: Thesis
Country: China  Candidate: Y C Wu  Full Text: PDF
GTID: 2416330647958908  Subject: Computer technology
Abstract/Summary:
As public awareness of the law grows, judicial resources become an increasingly important means of protecting people's work and daily life. To avoid wasting these resources, in 2018 the state launched a key R&D plan, "Research on Key Technologies of Cross-Regional and Cross-Level Related Early Warning and Collaborative Disposal of One Person and Multiple Cases". "One person, multiple cases" refers to cases in which one or both parties are the same, or in which the cause of action and the legal facts are the same. Detecting such cases depends on accurately identifying the elements of legal documents, which is therefore a key technology. Based on this research plan, this thesis focuses on identifying legal document elements and studies the problem from three aspects: data set construction, model construction and optimization, and experimental verification. The specific contents are as follows:

1) Data set construction: a data set for legal document element identification is built from a large number of legal documents. The main work includes collecting and cleaning a large volume of legal document data, designing appropriate labeling specifications, standardizing the data length according to the characteristics of legal documents, and building an efficient annotation program.

2) Deep neural network model construction: the benchmark is a character-level named entity recognition model (BiLSTM-CRF) composed of word2vec pre-trained vectors, a bidirectional long short-term memory neural network, and a conditional random field. The thesis addresses four of its shortcomings:
(1) For the problem of polysemy ("one word, multiple meanings"), an improved BERT design based on the attention mechanism (AttBERT) is proposed. First, BERT dynamically adjusts the word vector according to the semantic context, which addresses polysemy. Then an attention mechanism on top of BERT extracts richer character feature information to improve overall named entity recognition.
(2) For the insufficient use of the structured features of legal documents, an improved design method based on local position (Position) is proposed. It exploits the structured features of legal documents through a windowed attention mechanism to improve overall named entity recognition.
(3) For the insufficient ability to recognize complex and long entities, an improved design method based on an LSTM decoder (LSTMD) is proposed. The LSTM decoder better captures the context dependence of entities and is combined with the CRF to improve recognition of complex and long entities.
(4) For the "full-text annotation inconsistency" problem in long texts, an improved design method based on a global attention mechanism (Attention) is proposed. It uses attention to capture global information and strengthen the semantic relationships between contexts.

3) Thorough experimental design: experiments are designed for each of the above methods to find the optimal hyperparameters, and the effectiveness of each method is verified by comparison with the baseline model's results. Cross experiments analyze the effect of each method on the baseline model. The experimental results show that, for key element identification, the baseline model reaches a precision of 96.02, a recall of 95.82, and an F1 value of 95.92. Compared with the baseline, the improved model (AttBERT-Pos-BiLSTM-LSTMD-Attention-CRF) improves precision by 2.3%, recall by 2.61%, and F1 by 2.45%.
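The reported metrics can be illustrated with a minimal sketch of entity-level evaluation over character-level tags. It assumes the three baseline figures are precision, recall, and F1 in the usual sense, and that entities are marked with a BIO scheme; the abstract does not state the tagging scheme or the label set, so the labels `PER` and `ORG` and all function names below are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def extract_entities(tags):
    """Collect (start, end, label) spans from a BIO tag list; end is exclusive."""
    spans = []
    start, label = None, None
    # A trailing "O" sentinel closes any span that runs to the end of the list.
    for i, tag in enumerate(list(tags) + ["O"]):
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_prf(gold_tags, pred_tags):
    """Exact-match precision, recall, and F1 over entity spans."""
    gold = set(extract_entities(gold_tags))
    pred = set(extract_entities(pred_tags))
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall, f1_score(precision, recall)


# Hypothetical example: the second entity is predicted one character too short,
# so only one of two predicted spans matches a gold span exactly.
gold = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG", "I-ORG"]
pred = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG", "O"]
print(entity_prf(gold, pred))  # (0.5, 0.5, 0.5)

# Consistency check on the baseline figures quoted in the abstract.
print(round(f1_score(96.02, 95.82), 2))  # 95.92
```

The last line confirms that the quoted baseline F1 value is the harmonic mean of the other two baseline figures. Exact span matching is one common convention for entity-level scoring; the thesis may use a different matching rule.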
Keywords/Search Tags: element recognition, attention mechanism, bidirectional long short-term memory neural network, conditional random field