
Research And Implementation Of Abnormal Data Processing Technology For Web Semantic Tables

Posted on: 2023-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: S J Gao
GTID: 2568307061451244
Subject: Computer technology
Abstract/Summary:
With the rapid development of the World Wide Web, structured and semi-structured data such as HTML tables are growing rapidly. They make knowledge acquisition easier and have become an important data source for many machine learning tasks. Within this mass of Web data, a table whose content carries semantic information is called a Web semantic table. On Wikipedia, Baidu Encyclopedia, and similar websites, Web semantic tables are open to users and anyone can edit them, which results in a large amount of abnormal data and even maliciously tampered information. Effectively identifying and repairing anomalies in Web semantic tables has therefore become an important open problem in this field, and one of practical significance.

Exception handling for Web semantic tables is mainly divided into error discovery and exception repair. The former can be regarded as fact checking for Web semantic tables; the latter requires external knowledge to infer and repair the content of erroneous cells. Existing work has studied this area extensively, but deficiencies remain: traditional table exception handling techniques are limited to tables with strict schema definitions and struggle with Web semantic tables whose semantic information is ambiguous or even wrong; fact-checking algorithms focus on using tables as a trusted source to check text sentences and the like, and there is little research on anomaly processing for Web semantic tables themselves; and when external knowledge bases are introduced, existing methods are limited to matching relationships between knowledge graph entities and table columns, whereas the complex entity relationships between cells in a Web semantic table are difficult to match directly.

In response to these problems, this study proposes an error discovery and exception repair mechanism for Web semantic tables, which includes three parts:

(1) Because traditional algorithms have difficulty handling Web semantic tables whose semantic information is ambiguous or even wrong, this study proposes a table error detection scheme based on mapping relationships. The scheme characterizes the meaning of the text and infers column pattern information in order to detect errors.

(2) Because existing research has difficulty handling the complex entity relationships that appear in tabular form in Web semantic tables, this study proposes a table exception repair scheme based on knowledge graphs. Based on the relationship between tables and the knowledge graph, a multi-hop query over knowledge graph entities and a table-oriented knowledge graph fusion anomaly repair model are designed to repair the erroneous cell data found in the table and infer the correct result.

(3) Based on the above theoretical results, an exception handling system for Web semantic tables is designed and developed. The system separates the front end from the back end, integrates the error discovery and repair components above, and provides a user-oriented visual interface that displays entity relationships and is convenient for users to operate.

To sum up, in view of the widespread errors and exceptions in Web semantic tables, this study designs an error discovery and exception repair algorithm for Web semantic tables and implements a prototype exception handling system for them. Experiments show that the proposed scheme effectively addresses the problems raised above. The proposed algorithms and system contribute to the further processing of Web semantic tables and can be applied to related fields of tabular data processing.
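
The abstract gives no implementation details, so the following Python sketch is only a toy illustration of the general idea behind parts (1) and (2): infer a column pattern from the rows, flag cells that contradict it, and repair them with a multi-hop knowledge graph query. It is not the thesis's model. The dictionary `KG`, the relation names, the sample rows, and every function name are assumptions made for this example, and a simple majority vote over relation paths stands in for the learned semantic representation the abstract mentions.

```python
from collections import Counter
from itertools import product

# Hypothetical toy knowledge graph: (subject, relation) -> object.
KG = {
    ("Inception", "directed_by"): "Christopher Nolan",
    ("Christopher Nolan", "born_in"): "London",
    ("Titanic", "directed_by"): "James Cameron",
    ("James Cameron", "born_in"): "Kapuskasing",
    ("Avatar", "directed_by"): "James Cameron",
}
RELATIONS = sorted({rel for _, rel in KG})


def follow(entity, path):
    """Multi-hop query: follow each relation in turn; None if a hop is missing."""
    for rel in path:
        entity = KG.get((entity, rel))
        if entity is None:
            return None
    return entity


def candidate_paths(max_hops=2):
    """Enumerate every relation path of length 1..max_hops."""
    for hops in range(1, max_hops + 1):
        yield from product(RELATIONS, repeat=hops)


def infer_column_path(rows, max_hops=2):
    """Pick the relation path that explains the most (subject, value) rows."""
    votes = Counter()
    for subject, value in rows:
        for path in candidate_paths(max_hops):
            if follow(subject, path) == value:
                votes[path] += 1
    return votes.most_common(1)[0][0] if votes else None


def detect_and_repair(rows, max_hops=2):
    """Return the inferred path and rows as (subject, value, was_repaired)."""
    path = infer_column_path(rows, max_hops)
    result = []
    for subject, value in rows:
        expected = follow(subject, path) if path else None
        if expected is not None and expected != value:
            result.append((subject, expected, True))   # cell contradicts the pattern
        else:
            result.append((subject, value, False))     # consistent or unverifiable
    return path, result


# Two-column table: film, director's birthplace. The last cell is wrong.
rows = [("Inception", "London"), ("Titanic", "Kapuskasing"), ("Avatar", "London")]
path, fixed = detect_and_repair(rows)
print(path)
print(fixed)
```

In this sketch the column pattern is the two-hop path from a film to its director's birthplace, which no single column-to-relation match could express. Cells that the graph cannot verify are left unchanged instead of being guessed.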
Keywords/Search Tags: Web Semantic Table, Error Discovery, Error Repair, Semantic Representation, Knowledge Graphs