In recent years, knowledge graphs have been widely adopted as an efficient form of data organization in many fields. As different institutions and organizations construct knowledge graphs tailored to their own needs, the information held by different knowledge graphs is often complementary. It is therefore desirable to integrate separate knowledge graphs into a single graph with broader coverage. The key step in knowledge graph fusion is entity alignment, whose purpose is to identify pairs of entities from different knowledge graphs that refer to the same real-world concept. Most existing entity alignment models rely heavily on the structural information of the knowledge graph, while other types of information, especially multi-modal information, are underused. Although a few works do exploit multi-modal information, they typically aggregate the different modalities with fixed weights, ignoring the fact that the importance of each modality varies across conditions. In addition, entity alignment requires pre-aligned entity pairs as training data, and previous studies have shown that the quantity and quality of this training data directly affect alignment performance; however, acquiring such data is not easy. This paper addresses the above problems, and its main contributions are as follows:

(1) This paper proposes a multi-modal knowledge graph entity alignment model. In addition to structural information, the model uses relation, attribute, entity name, and, notably, image information. It first uses separate encoders to generate a modality-specific embedding for each entity: a GCN models the structural information of the knowledge graph, while relation and attribute information is treated as bag-of-words
features; entity name features are obtained by averaging pre-trained GloVe vectors; each of these features is then fed into a simple feedforward neural network to obtain its embedding. For images, a pre-trained visual model, ResNet-152, extracts visual features, which are likewise fed into a feedforward network to obtain visual embeddings. When integrating the different modalities, the model does not generate a single final embedding per entity as most models do. Instead, it first generates an entity similarity matrix for each modality, then integrates these matrices into a final entity similarity matrix, from which the alignment results are obtained.

(2) To generate the final entity similarity matrix, this paper uses an adaptive feature fusion strategy that dynamically assigns aggregation weights to the different modalities of each entity. The weights are adjusted according to the richness of an entity's structural information: when structural information is scarce, the weights of the other modalities are increased. Experiments verify that this adaptive fusion strategy effectively improves alignment accuracy for long-tail entities.

(3) To overcome the shortage of training data, this paper designs an iterative strategy to enlarge the training set. This semi-supervised method still requires a small number of pre-aligned entity pairs as seeds, so it cannot fully eliminate the dependence on training data. Building on it, this paper further designs two fully unsupervised iterative strategies that do not
require any pre-aligned entity pairs as training data.
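The per-modality encoders described in contribution (1) can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: the toy `glove` dictionary, the vector dimension, and the single-hidden-layer `FeedForward` class are assumptions standing in for the real pre-trained GloVe vectors and the paper's feedforward networks.

```python
import numpy as np

def name_embedding(name, glove, dim=2):
    """Average the GloVe vectors of the tokens in an entity name.

    Tokens missing from the (toy) `glove` lookup fall back to a zero vector.
    """
    vecs = [glove.get(tok.lower(), np.zeros(dim)) for tok in name.split()]
    return np.mean(vecs, axis=0)

class FeedForward:
    """Minimal single-layer feedforward encoder with ReLU (illustrative)."""
    def __init__(self, d_in, d_out, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(0.0, 0.1, (d_in, d_out))
        self.b = np.zeros(d_out)

    def __call__(self, x):
        # Project the raw modality feature into the shared embedding space.
        return np.maximum(0.0, x @ self.w + self.b)
```

The same feedforward projection would be applied analogously to the bag-of-words relation/attribute features and to ResNet-152 visual features.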
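The similarity-matrix fusion with adaptive weights from contribution (2) can be sketched as follows. The degree-based "richness" measure `d / (d + 1)` and the even split of the remaining weight across non-structural modalities are illustrative assumptions, not the paper's exact formulation; the key idea shown is that a low-degree (long-tail) entity shifts weight away from the structural similarity row toward the other modalities.

```python
import numpy as np

def cosine_sim_matrix(a, b):
    """Pairwise cosine similarity between row-wise embedding matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def adaptive_fuse(sim_by_modality, degrees, struct_key="structure"):
    """Fuse per-modality similarity matrices row by row.

    `degrees[i]` is the number of structural neighbours of entity i in the
    source graph; entities with few neighbours down-weight the structural
    similarity and up-weight the remaining modalities.
    """
    richness = degrees / (degrees + 1.0)  # in [0, 1); 0 when degree is 0
    others = [m for m in sim_by_modality if m != struct_key]
    fused = np.zeros_like(sim_by_modality[struct_key])
    for i, w_struct in enumerate(richness):
        w_other = (1.0 - w_struct) / len(others)
        fused[i] = w_struct * sim_by_modality[struct_key][i]
        for m in others:
            fused[i] += w_other * sim_by_modality[m][i]
    return fused
```

Given the fused matrix, a simple decoding rule such as taking the row-wise argmax yields the predicted alignment for each source entity.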
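The iterative training-data expansion in contribution (3) is commonly realised by pseudo-labelling: after each round, entity pairs that are mutual nearest neighbours in the current similarity matrix (optionally above a confidence threshold) are added to the training set for the next round. The sketch below illustrates that selection step under these assumptions; it is not the paper's specific strategy.

```python
import numpy as np

def mutual_nearest_pairs(sim, threshold=0.0):
    """Return (i, j) pairs that are each other's nearest neighbour with
    similarity above `threshold` -- candidate pseudo-labels for the next
    training iteration."""
    best_row = sim.argmax(axis=1)  # best KG2 match for each KG1 entity
    best_col = sim.argmax(axis=0)  # best KG1 match for each KG2 entity
    pairs = []
    for i, j in enumerate(best_row):
        if best_col[j] == i and sim[i, j] > threshold:
            pairs.append((i, int(j)))
    return pairs
```

Requiring the match to be mutual (and thresholded) keeps the expanded training set precise, which matters because noisy pseudo-labels would otherwise accumulate across iterations.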