
Research On Event Extraction And Event Relation Recognition

Posted on: 2024-09-09 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Q Z Wan | Full Text: PDF
GTID: 1528307118454784 | Subject: Management Science and Engineering
Abstract/Summary:
With the development of text mining and deep learning, extracting meaningful information from unstructured text has attracted increasing attention. As a specific form of information, events can be extracted and effectively organized (e.g., by building event associations and forming an event graph) to discover significant knowledge, providing a foundation for subsequent research and downstream applications. The research in this thesis covers: (1) sentence-level open event extraction, (2) document-level event extraction, and (3) event representation learning and event relation recognition.

(1) Sentence-level open event extraction. Closed event extraction identifies events of pre-defined types along with their arguments, so it is not suitable for new event types or for scenarios in which event types are not known in advance. Existing work on open event extraction is insufficient and places many restrictions on event structure, which causes events to be missed. Existing methods also do not consider completing the missing elements of an event. To relieve these limitations, this thesis proposes two strategies for sentence-level open event extraction. In the first strategy, the characteristics of linguistic and syntactic dependency structures are analyzed, and 13 kinds of rules are formulated to identify events and complete their missing elements. The extraction performance on the experimental dataset attains 83.17%, outperforming the best baseline by 7.13 percentage points. The second strategy turns to deep learning and develops a multi-channel hierarchical graph attention network to ease the weak applicability of the rule-based method. Specifically, a bidirectional dependency parsing graph is constructed to capture more semantics, followed by an optimized graph attention network that integrates node level and node type into the embedding aggregation. Finally, a multi-channel mechanism is developed to fuse different features. Experiments are conducted on Chinese and English datasets, and the results improve on the best benchmark by 3.39 and 4.47 percentage points, respectively.

(2) Document-level event extraction. This task addresses multiple events and scattered arguments at the document level. Existing methods adopt a pipeline pattern and decompose the task into several sub-tasks, including entity extraction, entity-event correlation mapping, event type judgment, and argument role recognition. This pattern leads to inefficient extraction and error propagation. Each existing method also has other limitations, such as the need to specify the number of events contained in a document, pre-defined argument role orders, and decoding errors in the gold entity-entity matrix. Therefore, this thesis develops two document-level event extraction models.

First, a data structure is proposed to describe the matching relation among tokens, events, and argument roles, revealing the role that each token plays in each event. The multi-event extraction task is thereby transformed into a multi-class prediction of the argument role for each (token, event) pair, which reduces the time and space complexity of the model. Meanwhile, the correlation semantics of tokens acting as argument roles within the same event and across events are leveraged, improving extraction performance. Based on this data structure, a joint framework for document-level event extraction is constructed using a multi-channel learning mechanism. Experiments on a public corpus (ChFinAnn) attain 88.9%, which is 9.5 to 33.2 percentage points higher than the baselines.

Second, to reveal which tokens play the argument roles in an event of a specific event type, we investigate a token-token bidirectional event completed graph whose edge type is the relation (event type, argument role, argument role). This effectively handles multiple events, scattered arguments, and multi-role arguments. It also removes the need to pre-define the number of events, a limitation of the first model above. By converting the extraction task into the prediction and decoding of graph structure and edge types, a joint document-level event extraction framework is built that improves both the efficiency and the effectiveness of the model. Experiments on ChFinAnn achieve 94.1%, which is 5.2 percentage points higher than our first scheme and 15.3 to 38.4 percentage points higher than the baselines. We also evaluate the model on another corpus (DuEE-Fin), where it outperforms the baselines by 26.9 to 47.5 percentage points. Furthermore, the model achieves a large improvement in time and space efficiency.

In addition, this thesis studies document-level topic event extraction, which is a new research problem. The main goal of this task is to understand event semantics at the document level and to express the topic of a document in the form of events. To enrich the event embedding representation, event graphs that reflect the internal and external structures of events are constructed. Then, according to the meaning of each graph, a corresponding graph neural network is developed, resulting in a multifocal graph-based neural network scheme for document topic event extraction. Because directly comparable methods for this task are limited, our scheme is compared with similar models, including pre-training, event extraction, graph neural network, and subgraph network models. It outperforms the best baseline by 10.69 percentage points.

(3) Event representation learning and event relation recognition. Event representation learning is an essential step in event-related tasks (e.g., event relation recognition). Moreover, events do not exist independently but are related to each other, and constructing the relations between events can serve downstream applications. Therefore, an event representation learning strategy and an event relation recognition strategy are explored.

Event representation learning strategy. Existing work on event representation learning improves the distinctiveness of event representations by modeling the dot product of event elements and by using various relations and external knowledge. However, these methods lose the event context and require the relations to be specified in advance, resulting in poor applicability. To this end, this thesis proposes an event representation learning method that is based only on the input text and does not depend on relations or external knowledge. The model includes a proposed three-step conversion strategy and two defined attention coefficients. The method is evaluated on two downstream tasks, where it outperforms the best baseline by 2.70 and 2.25 percentage points, respectively.

Event relation recognition method. The existing set of event relation types is insufficient, and models that recognize multiple relation types are limited, so application requirements cannot be met. Therefore, we first define six event relation types based on financial news texts and then develop a complete event relation extraction framework that contains three components (event extraction, event combination, and event relation recognition). Specifically, event relation recognition is implemented on the BERT framework, with the input and the architecture of the BERT embedding layer optimized. Experiments are conducted on our annotated corpus, and the recognition performance of our scheme exceeds the best benchmark by 4.87 percentage points.
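
The (token, event) formulation used by the first document-level model can be illustrated with a toy sketch. This is not the thesis's implementation: the abstract does not give the exact data structure, so the matrix layout, the `NO_ROLE` label, the function names, and the example tokens and role names below are all assumptions made for illustration. The sketch only shows how assigning one role label to each (token, event) pair is enough to represent several events whose arguments are scattered across a document.

```python
from typing import Dict, List

# Hypothetical label meaning "this token plays no role in this event".
NO_ROLE = "O"


def build_role_matrix(
    tokens: List[str], events: List[Dict[str, List[int]]]
) -> List[List[str]]:
    """Return matrix[t][e]: the argument role of token t in event e.

    Each event is given as {role name: [token positions]} (assumed format).
    """
    matrix = [[NO_ROLE] * len(events) for _ in tokens]
    for e, event in enumerate(events):
        for role, positions in event.items():
            for t in positions:
                matrix[t][e] = role
    return matrix


def decode_events(
    tokens: List[str], matrix: List[List[str]]
) -> List[Dict[str, List[str]]]:
    """Read the events back out of a (token x event) role matrix."""
    n_events = len(matrix[0]) if matrix else 0
    events: List[Dict[str, List[str]]] = [{} for _ in range(n_events)]
    for t, row in enumerate(matrix):
        for e, role in enumerate(row):
            if role != NO_ROLE:
                events[e].setdefault(role, []).append(tokens[t])
    return events


# Toy document with two events; tokens and role names are made up.
tokens = ["CompanyA", "pledged", "500", "shares", ";",
          "CompanyB", "bought", "200", "shares"]
events = [
    {"Pledger": [0], "Shares": [2]},   # event 0
    {"Buyer": [5], "Shares": [7]},     # event 1
]

matrix = build_role_matrix(tokens, events)
recovered = decode_events(tokens, matrix)
```

In a learned model, each cell of such a matrix would be predicted by a classifier over the role labels rather than filled in from gold annotations. Note that this layout fixes the number of event columns in advance, which matches the limitation that the abstract says the second, graph-based model removes.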
Keywords/Search Tags: event extraction, topic event extraction, event relation recognition, dependency structure, graph neural network