Research on a Knowledge Base Triple Denoising Method Based on MapReduce Parallelization

Posted on: 2021-09-28
Degree: Master
Type: Thesis
Country: China
Candidate: H Z Gu
Full Text: PDF
GTID: 2518306050483734
Subject: Management Science and Engineering
Abstract/Summary:
Since the 1950s, Internet technology has gradually matured, and people increasingly rely on the Internet to store and query data, which has greatly improved working efficiency. Building network knowledge bases has therefore become more and more important. Because network knowledge bases originated in Western countries, knowledge bases for Chinese semantics are still in their infancy. Word segmentation rules and data acquisition techniques are not unified, and entries are compiled in an open and free way, so users can classify and extract semantic information from Chinese online sources on their own. As a result, the content is often inaccurate and redundant, and Chinese semantic knowledge bases contain a large amount of noisy data. To remove this noise, this paper builds on the concept of the data field, and its first task is to calculate semantic similarity. However, because Chinese online encyclopedias hold a huge volume of data and the algorithms are complex, a serial program consumes an unacceptable amount of time and space. This paper therefore proposes a MapReduce parallel processing framework on the Hadoop big data platform, which both makes better use of the resources available to the algorithm and shortens its execution time. The experimental results show that the framework is significantly effective for computation over massive knowledge data.
Keywords/Search Tags: Knowledge Base, Edit distance, Hadoop/MapReduce, Initial correlation, Denoising