| Since the 19th National Congress of the Communist Party of China, the use of big data, artificial intelligence, and other technologies to unswervingly promote comprehensive and strict governance of the Party has become an important direction for discipline inspection and supervision work. In the new era and new environment of "Internet + disciplinary inspection", disciplinary inspection and supervision departments at all levels, along with reporting and case-handling departments, face increasingly arduous tasks. The classification of disciplinary inspection cases and the extraction of event elements are key steps for similar-case recommendation, question answering systems, and the analysis of linked cases. Given the massive volume of disciplinary inspection data, manual extraction of case event elements severely restricts the efficiency of disciplinary inspection work. At present, however, there is no work on automatic event extraction in the field of disciplinary inspection in China. Event extraction is one of the key tasks of information extraction. Compared with traditional machine learning methods, deep learning methods have powerful feature learning and representation capabilities. In recent years, event extraction has been widely applied in various fields and has made breakthrough progress, and deep learning has gradually become the mainstream approach to the task. Compared with other event extraction tasks, disciplinary inspection case events are characterized by many event types, special specifications, strong correlation between event arguments and event types, and a large number of domain-specific nouns and terms. These characteristics pose new challenges for disciplinary inspection event extraction. Based on deep learning, this research explores the task of extracting case events in the discipline inspection field. The main tasks are: (1) Construct a corpus of disciplinary inspection and supervision case
events. At present, there is no corpus of disciplinary inspection case events. This thesis uses web crawlers to collect disciplinary inspection case data from the websites of the Sichuan Discipline Inspection Commission and the Liaoning Discipline Inspection Commission. The event types and arguments that appear in the text are tagged with BIO annotations, yielding a corpus of discipline inspection and supervision events that lays the foundation for follow-up work in this field. (2) Propose the BERT-BiGRU-CRF event extraction model. The discipline inspection field contains many proper nouns and terms, and word vectors trained on open-domain corpora cannot express the semantics of such text well. This thesis uses the BERT model, trained on the discipline inspection corpus, to construct word vectors for text in the discipline inspection field, and treats event type recognition and argument extraction as a unified sequence labeling problem, which makes full use of the interaction between event types and arguments. The BiGRU model is used to capture contextual semantics, and the CRF model is used to perform event type recognition and argument extraction. Experiments show that the model proposed in this thesis can be better applied to the task of case event extraction in the field of discipline inspection and supervision. (3) Establish a system for querying disciplinary inspection cases and events. This thesis builds a disciplinary inspection and supervision event extraction system that can quickly analyze all event information contained in a case, providing disciplinary inspection staff with a convenient and efficient way to extract events. |
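
Treating event type recognition and argument extraction as one sequence labeling problem can be illustrated with a small decoding sketch. It assumes each BIO tag carries both the event type and the role, written here as `B-<EventType>.<Role>`. That label format, the example sentence, and the helper names `decode_bio` and `group_events` are hypothetical illustrations and are not taken from the corpus or model described above. English tokens are joined with spaces for readability; a Chinese corpus would more likely be tagged character by character.

```python
from collections import defaultdict


def decode_bio(tokens, tags):
    """Collect (label, text) spans from a BIO-tagged token sequence."""
    spans, current_label, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current_label:
            # Continuation of the span that is currently open.
            current_tokens.append(token)
            continue
        if current_label is not None:
            # Any other tag closes the open span.
            spans.append((current_label, " ".join(current_tokens)))
        if prefix == "B":
            current_label, current_tokens = label, [token]
        else:
            # "O", or a stray "I-" without a matching "B-": no open span.
            current_label, current_tokens = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


def group_events(spans):
    """Group decoded spans by event type, assuming 'EventType.Role' labels."""
    events = defaultdict(dict)
    for label, text in spans:
        event_type, role = label.split(".", 1)
        events[event_type][role] = text
    return dict(events)


# Hypothetical example sentence and tags (not from the thesis corpus).
tokens = ["Zhang", "San", "accepted", "bribes", "in", "2019"]
tags = [
    "B-Bribery.Person", "I-Bribery.Person",
    "B-Bribery.Trigger", "I-Bribery.Trigger",
    "O", "B-Bribery.Time",
]
events = group_events(decode_bio(tokens, tags))
print(events)
```

Because every tag names the event type together with the role, a single decoding pass over the tagger output recovers both the event type and its arguments, which is the interaction the unified formulation is meant to exploit.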