Open Domain Event Type And Schema Induction

Posted on: 2024-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: Q Yang
Full Text: PDF
GTID: 2568306914960039
Subject: Information and Communication Engineering
Abstract/Summary:
The task of event extraction is to present the event information contained in unstructured text in a structured form. Depending on whether an event schema is given, it is usually divided into specific-domain event extraction and open-domain event extraction. A complete set of event schemas consists of multiple event types and their corresponding schemas, where each schema defines the necessary argument roles of one event type. Given the event schemas of a specific domain, specific-domain event extraction processes natural language text and identifies each event's type and arguments. Open-domain event extraction requires a structured representation of the event information in text without predefined event schemas. Building a complete set of event schemas usually requires annotation by domain experts, who cannot predefine all possible event types and schemas. Inducing new event types and their corresponding schemas from open-domain event text is therefore a very challenging task.

Existing research methods still have many deficiencies. For the subtask of new event type induction, existing methods not only rely on a large amount of annotated data, but also set up complex objective functions for learning from labeled and unlabeled data, and the different objective functions often have to be balanced through additional parameters. For the subtask of event schema induction, existing work usually performs this task jointly with event type induction. The resulting event argument roles mostly rely on syntactic parsing tools or are limited to a manually defined glossary of candidate roles, so the induced event schemas lack the characteristics that distinguish one event type from another. This thesis therefore studies open-domain event type and schema induction, divided into two parts: new event type induction, and event schema induction for each specific event type under the open-domain condition.

For the task of inducing new event types, this thesis proposes a new event type induction method based on reliable pseudo-label prediction, aiming to simplify the complex objective functions of existing methods. Given some labeled data of seen event types, the method first obtains prediction results for all types by concatenation. It then applies a Double Label Reassignment strategy, designed by combining different deep clustering algorithms, to construct reliable pseudo-labels for the unlabeled data. In this way, the model can learn both seen and unseen event types with a unified objective function and automatically induce new event types from unlabeled text. Experiments show that the approach outperforms the state of the art on the benchmark without requiring parameters to balance multiple complex objective functions.

For the event schema induction of one specific type, this thesis proposes a specific event schema induction method based on dual prompt learning, aiming to automatically induce event argument roles that carry event characteristics. For the clustered event mentions of each event type, the model first identifies the named entities in all event mentions as candidate arguments. It then uses the dual prompt learning method to name and verify the candidate argument roles over the whole vocabulary. Finally, it iteratively merges argument roles with the same semantics using a variable threshold, thereby inducing the event schema. Experiments show that the precision and recall of the proposed method exceed those of existing methods. In addition, the analysis of downstream tasks and examples shows that the proposed method induces event schemas with more event characteristics than existing methods, which helps present open-domain events in a structured form.
Keywords/Search Tags: open domain event extraction, event schema induction, pseudo label, prompt learning
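
The abstract's final schema-induction step, iteratively merging argument roles with the same semantics under a variable threshold, can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not specify the similarity measure or how the threshold varies, so the token-overlap similarity, the average-linkage rule, the decreasing threshold schedule, and all function names below are assumptions chosen only to make the idea concrete.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two role names.

    Assumption: a stand-in for whatever semantic similarity the thesis uses.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cluster_similarity(c1: list[str], c2: list[str]) -> float:
    """Average pairwise similarity between two clusters of role names."""
    total = sum(jaccard(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def merge_roles(roles: list[str], thresholds=(0.9, 0.7, 0.5)) -> list[list[str]]:
    """Merge role clusters round by round under a relaxing threshold.

    Assumption: the schedule in `thresholds` is hypothetical; the thesis
    only states that the threshold is variable.
    """
    # Start with one cluster per distinct role name, preserving input order.
    clusters = [[r] for r in dict.fromkeys(roles)]
    for threshold in thresholds:
        while True:
            # Find the most similar pair that meets the current threshold.
            best = None
            for i, j in combinations(range(len(clusters)), 2):
                s = cluster_similarity(clusters[i], clusters[j])
                if s >= threshold and (best is None or s > best[0]):
                    best = (s, i, j)
            if best is None:
                break  # nothing left to merge at this threshold
            _, i, j = best
            clusters[i] = clusters[i] + clusters[j]
            del clusters[j]  # j > i, so index i is unaffected
    return clusters


if __name__ == "__main__":
    candidate_roles = ["attacker", "the attacker", "attack target", "target", "place"]
    print(merge_roles(candidate_roles))
```

With the hypothetical schedule above, the five candidate names collapse into three roles: "attacker" merges with "the attacker", "attack target" merges with "target", and "place" stays alone. With a single strict threshold such as `(0.9,)`, nothing merges. In practice, an embedding-based similarity would replace `jaccard`.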