| In recent years, with the continuous advancement of China's judicial informatization and the gradual implementation of the policy of publishing judgment documents online, the number of judgment documents published on the Internet has exploded. Judgment documents are of great significance in helping legal staff clarify the focus of disputes, analyze cases, and improve the quality and efficiency of conflict resolution. However, most of these judgment documents are lengthy unstructured data, resulting in low data value density and ineffective support for decision-making. Because a judgment document is case-centric and contains rich event information, event extraction can obtain event-centered structured information from it more quickly and accurately. However, current event extraction methods are not well suited to judgment documents. First, the traditional event structure cannot express the relationships between the arguments of a legal event, and the traditional classification of event types is not conducive to distinguishing similar crime event types in judgment documents. Second, previous methods cannot solve the long-distance dependency and reference resolution problems that arise in event extraction from judgment documents. Therefore, starting from the characteristics of legal events in judgment documents, this thesis defines a dynamic hierarchical event structure suited to judgment documents, constructs a judgment document event extraction data set, and proposes a dynamic hierarchical event extraction model based on the pedal attention mechanism. In addition, deep learning event extraction models require a large amount of supervised data for training, which conflicts with the high cost of labeling. Therefore, this thesis introduces active learning to reduce the labeling cost and assist event extraction from judgment documents. However, judgment document 
event extraction has a complex single-sample, multi-task form, and existing active learning methods cannot measure the importance of a sample to the model well. Therefore, this thesis designs a new active learning framework and proposes a memory-based loss prediction model to evaluate sample importance. Specifically, the main research work of this thesis is as follows: (1) This thesis analyzes the characteristics of event types in judgment documents and defines the dynamic hierarchical event structure; referring to the ACE2005 event labeling specification, a data set containing 2380 judgment documents is constructed. (2) Based on the self-built judgment document event extraction data set, this thesis designs a hierarchical event extraction model based on the pedal attention mechanism. Hierarchical event features are designed to distinguish similar events. LTP is used to construct the dependency syntactic features of sentences, and the pedal attention mechanism is used to solve the problems of long-distance dependency and anaphora resolution in event argument classification. (3) An event extraction framework based on active learning is proposed to address the difficulty and high cost of labeling supervised data when the proposed event extraction algorithm is popularized and applied. In addition, according to the characteristics of the event extraction task and the model's performance during the training-labeling iterations, a sample selection strategy based on delayed loss prediction is proposed. (4) Based on the event extraction model and the active learning training strategy, this thesis designs and implements a judgment document event extraction system. On the one hand, the system performs automatic event extraction from judgment documents; on the other hand, based on the active learning method, it provides online event labeling, which addresses the high cost of event labeling, insufficient training corpus, and difficulty in 
expanding event categories in practical applications, and thus improves the flexibility of the system. The main contributions of this thesis are as follows: a judgment document event extraction data set is constructed based on the characteristics of judgment document event types; to address the difficulty of distinguishing similar events in judgment documents and the long-distance dependency and reference resolution problems in event arguments, a hierarchical judgment document event extraction model based on the pedal attention mechanism is designed; active learning is studied to address the difficulty of labeling supervised data for event extraction, and an active learning training strategy for event extraction based on delayed loss prediction is proposed; finally, through requirements analysis and by combining the event extraction model and active learning techniques studied in this thesis, a prototype judgment document event extraction system is implemented. |
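
The idea of using predicted loss to choose which samples to label can be illustrated with a minimal sketch. This is not the thesis's model: it assumes that some loss predictor has already produced a predicted loss per task for each unlabeled document, and that per-task losses are combined by a plain sum. The names `sample_score`, `select_for_labeling`, the task keys `trigger` and `argument`, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def sample_score(task_losses: Dict[str, float]) -> float:
    """Combine per-task predicted losses into one importance score.

    Assumption: a plain sum. The thesis may weight or combine tasks differently.
    """
    return sum(task_losses.values())


def select_for_labeling(pool: Dict[str, Dict[str, float]], budget: int) -> List[str]:
    """Return the ids of the `budget` unlabeled documents with the highest score."""
    ranked = sorted(pool, key=lambda doc_id: sample_score(pool[doc_id]), reverse=True)
    return ranked[:budget]


# Hypothetical predicted losses for three unlabeled documents.
pool = {
    "doc_a": {"trigger": 0.2, "argument": 0.1},
    "doc_b": {"trigger": 0.9, "argument": 1.4},
    "doc_c": {"trigger": 0.5, "argument": 0.3},
}

selected = select_for_labeling(pool, budget=2)
print(selected)  # ['doc_b', 'doc_c']
```

In a training-labeling loop, the selected documents would be sent to annotators, added to the training set, and the model and loss predictor retrained before the next round of selection.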