Research On Event Extraction And Fusion Technology Based On Deep Learning

Posted on: 2024-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Zeng
GTID: 2568307079471324
Subject: Electronic information
Abstract/Summary:
Event extraction and event coreference resolution are crucial technologies for extracting target information from massive unstructured texts. In essence, they extract structured core elements from redundant and complex texts, and then align and fuse them to obtain complete and streamlined event information. However, because of the complex semantic environment and the multi-source heterogeneous data found in real-world settings, this research field faces many challenges, such as low extraction accuracy and poor model generalization. This thesis focuses on the event extraction and coreference resolution tasks, and conducts the following research on their key technologies based on deep learning.

First, to address the low accuracy of traditional sequence labeling event extraction models and their difficulty in handling overlapping elements, this thesis proposes a segmented event extraction method based on ERNIE-BiLSTM-CRF (EBC). Building on the sequence labeling approach, EBC uses a large pre-trained model in place of a traditional neural network to extract text features. It also adds a BiLSTM layer to capture long-term dependencies and a CRF layer to learn sequence features. Information is integrated into the extraction process as prior knowledge, and a separate label output layer is constructed for each type of event contained in the text. Experimental results show that, compared with the best existing event extraction methods, the EBC model improves the F1-scores of trigger extraction and role extraction by 1.26% and 0.97% respectively on the DuEE dataset. On the multi-event text extraction task, the F1-scores increase by 0.77% and 9.90%.

Second, by analyzing the characteristics of sequence labeling and machine reading comprehension methods and their performance differences in event extraction tasks, this thesis proposes a sequence labeling event extraction method based on machine reading comprehension correction and filling (SL-MRC). SL-MRC takes sequence labeling as the main body and machine reading comprehension as the correction and filling module. It leverages the strong extraction ability of the sequence labeling model on large-scale data while retaining the extraction ability of the machine reading comprehension model in few-shot and even zero-shot settings, making the model accurate, data-efficient, and capable of cold start at the same time. Experimental results show that the SL-MRC method performs well in both large-scale and few-shot settings. Relative to the EBC model, the F1-scores of the trigger and role extraction tasks increase by a further 1.61% and 0.82%.

Finally, for event coreference resolution, this thesis proposes a short-text event coreference resolution algorithm based on context prediction (ECR-CP). From the perspective of problem modeling, ECR-CP transforms the event coreference resolution task into a context prediction task: it determines whether two event descriptions refer to the same event by examining whether they can form a coherent pair of preceding and following sentences. Furthermore, the event category, trigger word, argument, entity alignment, and other information obtained from event extraction are integrated into the feature engineering. Experimental results show that the ECR-CP method can effectively complete the event-pair coreference resolution task. Compared with traditional methods based on semantic similarity calculation, the recall and F1-score of the ECR-CP model on the CCKS2021 dataset increase by 8.75% and 7.6%. Compared with existing methods based on pre-trained models, the recall and F1-score increase by 4.98% and 3.06% respectively.
Keywords/Search Tags: Event Extraction, Event Coreference Resolution, Sequence Labeling, Machine Reading Comprehension, Deep Learning
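
The abstract states that EBC builds one label output layer per event type to handle overlapping elements. The short Python sketch below illustrates one way such per-type layers could be decoded. It is not the thesis's implementation: the BIO tag scheme, the function names `decode_bio` and `decode_layers`, the example sentence, and the event types and role names are all assumptions made for illustration. The point it shows is that when each event type has its own tag sequence, the same token can carry a different role in each event without any conflict.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, role)


def decode_bio(tags: List[str]) -> List[Span]:
    """Convert one BIO tag sequence into (start, end, role) spans."""
    spans: List[Span] = []
    start, role = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and role == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, role))
            start, role = None, None
        if tag.startswith("B-"):
            start, role = i, tag[2:]
    return spans


def decode_layers(
    tokens: List[str], layers: Dict[str, List[str]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Decode one tag layer per event type into (role, text) pairs."""
    return {
        event_type: [
            (role, " ".join(tokens[s:e])) for s, e, role in decode_bio(tags)
        ]
        for event_type, tags in layers.items()
    }


# Hypothetical example: "Acme" plays a role in two different events.
tokens = ["Acme", "acquired", "Beta", "and", "dismissed", "its", "CEO"]
layers = {
    "Acquisition": ["B-acquirer", "O", "B-target", "O", "O", "O", "O"],
    "Dismissal": ["B-employer", "O", "O", "O", "O", "O", "B-employee"],
}
events = decode_layers(tokens, layers)
```

With a single shared tag sequence, the token "Acme" could receive only one label. Keeping a separate layer per event type sidesteps that limit, at the cost of one output sequence per type.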