Research On Entity Extraction In Signal Processing Based On Dependency Word Vector

Posted on: 2021-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Zhang
GTID: 2428330611999455
Subject: Information and Communication Engineering
Abstract/Summary:
With the evolution of Internet technology, the amount of data is growing explosively, and every user is both a publisher and a receiver of information. How to extract useful structured information from massive data has therefore become a hot research topic in academia. In natural language processing, named entity extraction and its downstream tasks can extract such information. However, current research on entity extraction focuses mainly on fields with obvious business value, such as medicine, education, and finance. This thesis focuses on entity extraction in the field of signal processing: based on the particular characteristics of this domain, it builds an entity extraction model for signal processing and improves the model's performance. The main contents of this thesis are as follows.

Domain terms are extracted automatically based on mutual information. Terminology is one of the differences between a professional field and the general domain, and professional terms are mostly compound words. To segment terms correctly without destroying the internal structure of compound words, a complete set of domain terms is essential. Traditional automatic term extraction is mainly based on frequency or rules. Frequency-based methods lead to incomplete term extraction and a high error rate, while rule-based extraction requires rules to be specified again and again. This thesis uses the mutual information method to extract terms in the field of signal processing, because mutual information is a good measure of the strength of the association between one word string and another.

A dependency-based word vector generation model is constructed. Besides differing from the general domain and other fields in terminology, the professional field also differs considerably in how sentences are expressed: sentences in the professional field are clearer and more concise, and their references are unambiguous. Based on this difference, we propose using dependency relations to represent how sentences are expressed and restructure the training corpus accordingly. We show that dependency features improve the semantic expressiveness of word vectors, and we save the trained vectors for the downstream tasks.

An optimized named entity extraction model for the field of signal processing is constructed. Traditional named entity recognition models are mainly rule-based matching systems. This thesis uses a neural network model based on BiLSTM + CRF, which recasts the extraction task as a classification task and avoids formulating a large number of rules. The model uses the dependency word vectors together with other constraint features, such as word spacing and word shape, to improve extraction performance. Finally, an attention mechanism is added, which improves the completeness of the extracted entity words, and the final F1 score reaches 80.76%.

In summary, this thesis explores in depth the particular characteristics of the signal processing field and optimizes the named entity extraction model based on these characteristics to improve the performance of the extraction task.
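The mutual information step described in the abstract can be illustrated with a small sketch. The abstract does not give the thesis's exact formulation, so this example assumes pointwise mutual information (PMI) over adjacent token pairs, PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The function name `pmi_scores` and the `min_count` threshold are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def pmi_scores(sentences, min_count=2):
    """Score adjacent token pairs by pointwise mutual information.

    Illustrative assumption only: the thesis may use a different
    mutual-information variant, tokenization, or candidate filter.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        # Adjacent pairs are the candidate two-word terms.
        bigrams.update(zip(tokens, tokens[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    scores = {}
    for (left, right), count in bigrams.items():
        if count < min_count:
            # Skip rare pairs, whose PMI estimates are unreliable.
            continue
        p_pair = count / total_bigrams
        p_left = unigrams[left] / total_unigrams
        p_right = unigrams[right] / total_unigrams
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores


if __name__ == "__main__":
    corpus = [
        ["the", "fourier", "transform", "of", "the", "signal"],
        ["the", "fast", "fourier", "transform", "is", "efficient"],
        ["the", "signal", "is", "the", "input"],
    ]
    ranked = sorted(pmi_scores(corpus).items(), key=lambda kv: -kv[1])
    for pair, score in ranked:
        print(" ".join(pair), round(score, 3))
```

On this toy corpus the pair "fourier transform" scores higher than "the signal", even though both occur twice: "the" is frequent on its own, so its co-occurrence with "signal" is less informative. This is the sense in which a mutual-information score separates tightly bound compound terms from pairs that are merely frequent.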
Keywords/Search Tags: terminology extraction, word vector, dependency relationship, named entity recognition