| A Knowledge Graph is a structured representation of knowledge that models real-world entities, concepts, attributes, and the relationships among them. Knowledge Graphs play an important role in question answering, search, recommendation, and risk assessment. Existing research focuses on Knowledge Graph construction, including information extraction, knowledge fusion, and other construction techniques, but largely ignores Knowledge Graph quality. Differences in data sources, construction techniques, and update and maintenance practices can all introduce quality problems, so evaluating the quality of a Knowledge Graph is of great significance. Although great progress has been made in traditional data quality assessment, research on Knowledge Graph quality assessment remains limited. There are two main challenges: (1) which dimensions and criteria a Knowledge Graph quality assessment system should include, and how to evaluate the integrity and rationality of that system; (2) how to design quality assessment methods on the basis of that theoretical system. Current research is limited to one or a few quality dimensions, so its results cannot reflect the overall quality of a Knowledge Graph. It is therefore particularly important to develop a systematic Knowledge Graph quality assessment method guided by a theoretical system. To address these two challenges, this paper studies the quality assessment system and the assessment methods for Knowledge Graphs. The main contributions are as follows: 1. A multidimensional quality assessment system for Knowledge Graphs is proposed: By reviewing existing data quality dimensions, six quality dimensions are summarized that fit the characteristics of Knowledge Graphs and keep the assessment system objective and general. The definition, evolution history, and corresponding quality problems of the six quality dimensions are introduced in detail. After that, the 
dimensions are subdivided into quantitative or qualitative evaluation criteria. Finally, the integrity and rationality of the multidimensional quality assessment system are analyzed. 2. Two datasets, DBP600K-KQ and WD200K-KQ, are proposed: DBP600K-KQ and WD200K-KQ are constructed from DBpedia and Wikidata, respectively, by randomly selecting entities and completing their related knowledge. The two datasets contain 0.8% and 0.2%, respectively, of the triples in the original Knowledge Graphs. Statistical analysis shows that the datasets are similar to the original Knowledge Graphs in sparsity, distribution, and degree of aggregation, and that the quality of the Knowledge Graphs was not affected during construction. Compared with other datasets, assessment results on the constructed datasets better reflect the quality of the original Knowledge Graphs. In addition, because the two Knowledge Graphs are constructed in different ways, the generality of the assessment methods can be verified experimentally. 3. A set of Knowledge Graph quality assessment methods is proposed: Based on the above system, this paper designs calculation formulas and assessment methods for each dimension and criterion. Assessment is mainly based on identifying various quality problems. Dedicated methods are designed for complex quality problems, including an algorithm for mining negatively correlated attributes based on association rules, an algorithm for mining necessary and unique attributes based on machine learning, and an algorithm for mining redundant entities and redundant attributes based on statistical analysis and semantic information. Finally, comprehensive assessment results are obtained on the two datasets. The results show that DBpedia has higher quality in consistency, while Wikidata performs better in the other dimensions. In addition, the correlations among criteria are analyzed: except for the completeness criterion, most criteria show some positive correlation. Finally, based on the assessment results, quality conclusions 
and suggestions for the datasets are obtained. |
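
To make the idea of mining negatively correlated attributes with association rules more concrete, the following is a minimal sketch and not the algorithm used in the paper, whose details the abstract does not give. It assumes each entity is reduced to the set of attribute names it carries, and it flags pairs of frequent attributes whose lift (observed co-occurrence divided by the co-occurrence expected under independence) falls below a threshold. The function name, the toy entities, and the `min_support` and `max_lift` thresholds are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def negatively_correlated_pairs(entities, min_support=0.2, max_lift=0.5):
    """Return (attr_a, attr_b, lift) for frequent attribute pairs with low lift.

    entities: dict mapping an entity id to the set of attribute names it has.
    min_support and max_lift are illustrative thresholds, not values from the paper.
    """
    n = len(entities)
    single = Counter()  # number of entities carrying each attribute
    pair = Counter()    # number of entities carrying both attributes of a pair
    for attrs in entities.values():
        single.update(attrs)
        pair.update(combinations(sorted(attrs), 2))

    # Keep only attributes that are frequent enough to give a meaningful estimate.
    frequent = sorted(a for a, c in single.items() if c / n >= min_support)

    result = []
    for a, b in combinations(frequent, 2):
        expected = (single[a] / n) * (single[b] / n)  # co-occurrence under independence
        observed = pair[(a, b)] / n                   # actual co-occurrence
        lift = observed / expected
        if lift <= max_lift:
            result.append((a, b, lift))
    return result


# Toy data (hypothetical): persons have birthDate, organizations have foundingDate.
entities = {
    "e1": {"birthDate", "deathDate", "name"},
    "e2": {"birthDate", "name"},
    "e3": {"foundingDate", "name"},
    "e4": {"foundingDate", "name"},
    "e5": {"birthDate", "name"},
    "e6": {"foundingDate", "name"},
}
pairs = negatively_correlated_pairs(entities)
print(pairs)
```

Under this sketch, an entity that carries both attributes of a flagged pair would be a candidate consistency problem to inspect; whether it is an actual error still depends on the schema and on manual or rule-based verification.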