Natural language texts carry a certain degree of ambiguity, especially short texts such as comments, search queries, and dialogues. Entity linking eliminates this ambiguity by accurately identifying the target entities, and it plays a crucial role in downstream tasks such as knowledge graph construction and intelligent question answering. Entity linking is the task of associating entity mentions in unstructured text with the corresponding target entities in a database, and it mainly consists of two steps: mention detection and entity disambiguation. Most studies treat these two steps as independent tasks, which can lead to issues such as error propagation and insufficient use of information. The objective of this work is to treat mention detection and entity disambiguation as a joint task in the setting of short-text entity recognition and linking, achieving end-to-end entity linking. Although existing end-to-end entity linking work has made some progress, many problems and challenges remain. The main contributions of this work are as follows:

(1) To address the lack of entity semantic information and prior knowledge, and the low efficiency of models on large-scale data, an end-to-end entity linking model, BCM, based on a BERT joint Bi-encoder and Cross-encoder, is proposed to strengthen the interaction between the text containing the mention and the entity information in the database (illustrative sketches of a two-stage architecture and a distillation objective follow this abstract). A knowledge distillation strategy is then introduced to reduce the computational complexity of the model and improve its training and inference speed; at the same time, the model size and number of parameters are reduced while performance is improved. Experiments on both general-domain and zero-shot datasets verify that the model achieves the best F1 score among current end-to-end models, scales well, and runs faster than most models.

(2) To address the limited context of short texts and the insufficient learning of semantic information in the entity linking task, an end-to-end entity linking model is constructed that enhances the granularity of semantic representation and fuses entity-type information. The model is based on the BCM structure. Since local features may be too sparse to provide sufficient information for disambiguation, a global entity linking method is proposed that globally optimizes the consistency among entities mentioned in the same text. Entity-type features are further explored through three strategies: enhanced entity representation, hard negative sampling, and category constraints. The effectiveness of the model is verified through experiments on the dataset, and the F1 evaluation metric also reaches SOTA level.

In summary, for short-text entity recognition and linking, this work addresses the problems of scarce feature information in domain-specific data, the low efficiency of models on large-scale data, and the limited context of short texts. With the main goals of enhancing the interaction of existing feature information and improving model efficiency on large-scale data, we construct an end-to-end entity linking model that combines efficiency and performance in joint disambiguation through model structure design and feature mining.
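
The following is a minimal sketch, not the thesis' BCM implementation, of the general bi-encoder plus cross-encoder pattern that contribution (1) builds on: a bi-encoder scores the mention context and all candidate entities independently for fast recall, and a cross-encoder jointly re-reads only the top candidates for fine-grained disambiguation. The encoders below are simple stand-ins for BERT, and all class and variable names are illustrative assumptions.

```python
# Sketch of two-stage entity linking: bi-encoder retrieval + cross-encoder re-ranking.
# Placeholder encoders stand in for BERT; not the thesis' actual BCM model.
import torch
import torch.nn as nn

class BiEncoder(nn.Module):
    """Encodes mention context and entity description into independent vectors."""
    def __init__(self, vocab_size=30522, dim=128):
        super().__init__()
        self.ctx_emb = nn.EmbeddingBag(vocab_size, dim)  # placeholder for a BERT context encoder
        self.ent_emb = nn.EmbeddingBag(vocab_size, dim)  # placeholder for a BERT entity encoder

    def score(self, ctx_ids, ent_ids):
        # Dot product between independently encoded vectors: cheap and cacheable,
        # so entity vectors can be pre-computed for large knowledge bases.
        ctx_vec = self.ctx_emb(ctx_ids)                  # (batch, dim)
        ent_vec = self.ent_emb(ent_ids)                  # (num_entities, dim)
        return ctx_vec @ ent_vec.T                       # (batch, num_entities)

class CrossEncoder(nn.Module):
    """Reads the concatenated [context ; entity] pair for full token interaction."""
    def __init__(self, vocab_size=30522, dim=128):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, dim)      # placeholder for a joint BERT pass
        self.head = nn.Linear(dim, 1)

    def score(self, ctx_ids, ent_ids):
        pair = torch.cat([ctx_ids, ent_ids], dim=1)      # token-level concatenation of the pair
        return self.head(self.emb(pair)).squeeze(-1)     # one relevance score per pair

# Usage: coarse retrieval with the bi-encoder, then re-rank the top-k candidates.
bi, cross = BiEncoder(), CrossEncoder()
ctx = torch.randint(0, 30522, (1, 16))                   # tokenised mention context
entities = torch.randint(0, 30522, (1000, 32))           # tokenised entity descriptions
topk = bi.score(ctx, entities).topk(8).indices[0]        # top-8 candidate entities
scores = cross.score(ctx.expand(len(topk), -1), entities[topk])
best = topk[scores.argmax()]                             # index of the linked entity
```

The design point the sketch illustrates is the speed/accuracy trade-off named in the abstract: the bi-encoder keeps large-scale retrieval efficient, while the cross-encoder restores the fine-grained mention-entity interaction that independent encoding loses.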
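Contribution (1) also mentions a knowledge distillation strategy for reducing model complexity. The snippet below is illustrative only and assumes the standard soft-label distillation objective (softened teacher/student distributions plus a hard cross-entropy term); the thesis may use a different distillation scheme, and all parameter names are hypothetical.

```python
# Illustrative knowledge distillation loss: a lighter "student" linker mimics a
# heavier "teacher" (e.g. a cross-encoder) over the candidate-entity distribution.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: cross-entropy against the gold entity label.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```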