Research On Financial Entity Relation Discovery For Text Data

Posted on: 2021-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Gan
Full Text: PDF
GTID: 2428330623468165
Subject: Software engineering
Abstract/Summary:
With the rapid development of big data, artificial intelligence, and related technologies, the data-driven wave of intelligent transformation has brought new innovation opportunities and business models to the financial industry. A large volume of Internet text data in many forms is now produced in finance every day. Accurately and efficiently mining the important information in these financial texts, so as to improve the efficiency of financial services, is a key step toward intelligent finance. This thesis studies financial entity relation extraction, the core task of financial information mining. The goal of the task is to identify financial entities in text data and determine the semantic relations between them. The thesis explores both pipeline and joint entity relation extraction methods, analyzes the shortcomings of existing work, and proposes corresponding improvements to raise the performance of entity relation extraction. Specifically, the main contributions of this thesis are as follows:

1. We survey existing work on pipeline and joint entity relation extraction, summarize the state of research, and analyze the shortcomings of that work.

2. For pipeline entity relation extraction, to address the problem that models make insufficient use of semantic relation information, we propose a relation extraction model based on entity subsequence enhancement. The model encodes the entity subsequence of the input sentence independently, which strengthens the attention paid to entity subsequence information and the use made of it, while retaining the information in the original long sentence. In the feature fusion layer, entity-dependent attention is introduced to guide the model toward the important semantic information related to the entities, improving its ability to classify relation features. Experimental results show that the proposed model outperforms current mainstream models on the public SemEval-2010 Task 8 dataset and improves the F1 score by nearly 2% over the baseline on the financial relation dataset.

3. For joint entity relation extraction, to tackle noise and the weak expression of relation features in existing models, we propose a multi-head selection joint extraction model based on loss optimization and entity subsequence representation. The model reduces the impact of noise problems in entity recognition, such as class imbalance and missing entity labels, through the combined effect of two loss optimization strategies. In the relation classification layer, in addition to the entity encoding information, an entity subsequence representation is introduced to further strengthen the expression of relation features. Experimental results on the public CoNLL04 dataset show that the proposed model performs significantly better than current mainstream models, and it improves the overall F1 score by nearly 4% over the baseline on the financial relation dataset.
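To make the notion of an entity subsequence concrete, the following is a minimal sketch and not the thesis's implementation. The abstract does not define the term precisely, so the sketch assumes it means the span of tokens running from the start of the earlier entity to the end of the later one. The function name `entity_subsequence`, the span convention, and the example sentence are all hypothetical.

```python
from typing import List, Tuple

# Assumed span convention: (start, end) token indices, start inclusive, end exclusive.
Span = Tuple[int, int]


def entity_subsequence(tokens: List[str], head: Span, tail: Span) -> List[str]:
    """Return the tokens covering both entities and everything between them.

    Assumption: this is what "entity subsequence" refers to; the abstract
    does not give the exact definition.
    """
    start = min(head[0], tail[0])
    end = max(head[1], tail[1])
    if not (0 <= start < end <= len(tokens)):
        raise ValueError("entity spans fall outside the sentence")
    return tokens[start:end]


# Hypothetical example sentence with two made-up financial entities.
tokens = "Last year , Acme Bank quietly acquired a stake in Beta Fund , sources said".split()
head = (3, 5)    # "Acme Bank"
tail = (10, 12)  # "Beta Fund"

subsequence = entity_subsequence(tokens, head, tail)
print(" ".join(subsequence))
```

Under this assumption, a model could encode `subsequence` separately from the full token list and combine the two representations, so the shorter span highlights the words most likely to express the relation while the full sentence is still available.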
Keywords/Search Tags:Text Data, Entity Relation Extraction, Pipeline Extraction, Joint Extraction