| In the Internet age, which is rich in textual information, entity classification, the task of extracting entity information from text, has considerable research value. Because traditional entity categories are semantically broad and flat in structure, they cannot meet the needs of related technologies in natural language processing. As entity categories are refined semantically, fine-grained entity typing has become an important research topic. This continuous refinement of entity types leads to a shortage of fine-grained entity-annotated corpora. Although such corpora can be produced by automatic corpus generation, they contain noise. To address these problems, this thesis studies fine-grained entity classification based on multi-level text features, given identified entity mention positions, and considers two cases: a small number of human-annotated samples and a large number of noisy samples. For the first case, a fine-grained entity typing method with prototypical networks is proposed. Its macro-F1 and micro-F1 on the first-level types of the FIGER (GOLD) corpus are 81.5% and 81.5%, respectively, which are 0.5% and 0.5% higher than recent research. For the second case, a fine-grained entity typing method with hierarchical inference is proposed. Its micro-F1 is 80.0%, 72.5%, and 83.9% on the FIGER (GOLD), OntoNotes, and BBN corpora, respectively, which is 1.0%, 0.5%, and 3.4% higher than recent research. The main contributions of this thesis are as follows: 1. A multi-level feature extraction method is proposed. Previous studies extract only part of the text information, or fail to let different levels of information interact fully. The thesis extracts text features at the word level and the character level, and lets the entity mention interact with its context through structured self-attention, so as to obtain a more 
comprehensive text representation. 2. A fine-grained entity typing method with prototypical networks is proposed. Previous studies fail to consider how the semantics of an entity change across contexts, and fail to make effective use of external knowledge. The relationship between entity and context is obtained from the multi-level features, and fine-grained entity classification is carried out with prototypical networks when only a small number of labeled samples are available. The research covers two experimental settings. The few-shot setting studies how to learn the hierarchical relationship among categories from a small number of manually labeled samples, and the zero-shot setting studies how to use external knowledge to transfer category information learned from manual annotation to new categories. A hierarchical optimization method is added in the few-shot setting to improve classification performance, and a pre-trained language model is used as external knowledge to construct category prototypes in the zero-shot setting. 3. A fine-grained entity typing method with hierarchical inference is proposed. Previous studies fail to exploit the hierarchical relationship between entity types effectively, and existing data denoising methods suffer from confirmation bias. Hierarchical inference is used to improve the model’s ability to exploit the category hierarchy, and a top penalty term is added to the optimization objective to alleviate the impact of confirmation bias. 4. A fine-grained entity typing system with multi-level features is designed and implemented. Given input text, the system first locates the entities in the text and then assigns fine-grained labels to each entity according to its context. |
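
The prototypical-network idea named in contribution 2 can be illustrated with a minimal sketch. It is not the thesis's implementation: the 2-dimensional embeddings, the type labels `/person` and `/location`, and the function names `build_prototypes` and `classify` are all hypothetical, and the multi-level encoder that would produce real mention embeddings is not reproduced. The sketch also assumes a single label per mention, whereas fine-grained typing is usually multi-label. It shows only the generic mechanism: each type prototype is the mean of its support embeddings, and a query is scored by a softmax over negative squared Euclidean distances to the prototypes.

```python
import math
from collections import defaultdict


def build_prototypes(support):
    """Average the support embeddings of each label.

    support: list of (embedding, label) pairs; embeddings are lists of floats.
    Returns a dict mapping label -> mean embedding (the class prototype).
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return a probability per label: softmax over negative squared distances."""
    logits = {
        label: -squared_distance(query, proto)
        for label, proto in prototypes.items()
    }
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - top) for label, v in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Toy support set (hypothetical embeddings, two samples per type).
support = [
    ([1.0, 0.0], "/person"),
    ([0.8, 0.2], "/person"),
    ([0.0, 1.0], "/location"),
    ([0.2, 0.8], "/location"),
]
prototypes = build_prototypes(support)
probs = classify([0.9, 0.1], prototypes)
predicted = max(probs, key=probs.get)
print(predicted, probs)
```

In this toy setup the query lies closest to the `/person` prototype, so that label receives the highest probability. In a zero-shot variant, a prototype for an unseen type would have to come from an external source, such as a label representation from a pre-trained language model, rather than from averaged support samples.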