| In today’s information age, Internet-related technologies have been applied in many fields. Knowledge graph technology is one of the most important of these: it can represent massive amounts of information and the relationships within it. To preserve the inherent structure of the knowledge in a knowledge graph while making it easier to compute with, knowledge graph embedding was proposed, which represents the entities and relations of a knowledge graph as embedding vectors. A knowledge subgraph is a part of a knowledge graph, generally constructed from a given knowledge graph according to some strategy, and its size can be set according to actual needs. Compared with working on the original knowledge graph, research on knowledge subgraphs has two advantages. First, for a relatively large knowledge graph, studying its subgraphs can reduce time and space overhead to a certain extent and facilitates model optimization. Second, when constructing subgraphs, we can focus on the connections between entities and relations, emphasize semantic-level information, and build subgraphs according to different types of information. Ensemble learning combines multiple weak learners through a combination strategy in order to complete a learning task better. Because ensemble learning can improve model accuracy, this paper applies it to knowledge graph subgraphs. By constructing multiple knowledge graph subgraphs, training a base learner on each subgraph separately, and using an ensemble method to integrate the trained base learners, the model achieves better performance and better generalization. Training on individual subgraphs and then applying an ensemble strategy can also reduce the impact of noise in the complete knowledge graph on the model. The main work of 
this paper is as follows: (1) A knowledge graph subgraph construction algorithm is proposed. The subgraph design takes into account the different relations of the triples in the knowledge graph, the frequency with which entities and relations occur, the relationships among triples in the subgraph that share an entity, and the generality of the subgraph. The random walk method is used during construction to generate walk paths, and the generated paths are merged to complete the knowledge graph subgraph. (2) Building on the Bagging and Boosting ensemble learning methods and combining them with knowledge graph embedding, two algorithms are proposed: the subgraph Bagging algorithm and the subgraph Boosting algorithm. Each integrates the models obtained by the base learners, exploits their respective strengths, and yields a strong learner with better performance. (3) Knowledge graph subgraphs are constructed with the proposed construction method and assigned to the base learners of the subgraph Bagging and subgraph Boosting algorithms, and each base learner is trained on its subgraph. After training, the models are integrated and used for link prediction. Experiments are carried out on four datasets: FB15K, FB15K-237, WN18, and WN18RR. Subgraphs are built with the proposed construction algorithm, and the two ensemble learning algorithms are compared with the benchmark algorithm, with all methods trained on the subgraphs. The results show that the proposed method is better overall than the benchmark algorithm under the two 
evaluation metrics of Mean Rank (the average rank of the correct entity in the entity credibility ranking) and Hits@10 (the proportion of test cases in which the correct entity appears in the top ten of the entity credibility ranking), improving model performance to a certain extent. |
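
The two evaluation metrics can be illustrated with a minimal Python sketch. It is not the paper's evaluation code: the function names `rank_of_correct`, `mean_rank`, and `hits_at_k` are hypothetical, and the sketch assumes that a higher score means a more credible candidate entity and that ranks are 1-based.

```python
from typing import Sequence


def rank_of_correct(scores: Sequence[float], correct_index: int) -> int:
    """Return the 1-based rank of the correct candidate entity.

    Assumption: a higher score means a more credible candidate, so the
    rank is one plus the number of candidates scoring strictly higher.
    """
    target = scores[correct_index]
    return 1 + sum(1 for s in scores if s > target)


def mean_rank(ranks: Sequence[int]) -> float:
    """Average rank of the correct entity over all test triples (lower is better)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of test triples whose correct entity ranks in the top k (higher is better)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Illustrative ranks for four made-up test triples (not experimental data).
example_ranks = [1, 3, 12, 50]
print(mean_rank(example_ranks))      # (1 + 3 + 12 + 50) / 4
print(hits_at_k(example_ranks, 10))  # 2 of the 4 ranks are <= 10
```

Mean Rank is sensitive to a few very badly ranked triples, while Hits@10 only records whether the correct entity reaches the top ten, which is why the two are usually reported together.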