| With the development of informatization, data volumes have exploded and data quality problems have become increasingly prominent. To ensure that data can provide information efficiently, data quality assurance has become one of a company's important tasks. Faced with various data quality problems, most companies have developed proprietary data quality management systems based on their own business needs. However, most of these systems do not provide a standardized representation of data quality dimensions and constraint rules, so there is an urgent need to standardize data quality; oilfield companies are no exception. At the same time, how to describe complex rules defined at the schema layer, as well as rules defined at the instance layer, is a current research hotspot. In addition, as the concept of open data is promoted internationally, linking to external data to obtain more authoritative data, structure, or other necessary descriptions can also effectively improve data quality. To solve these problems, this thesis first surveys the development of data quality dimensions, constraint rules, data quality management frameworks, and data quality assessment techniques at home and abroad, providing theoretical support for the subsequent research. Second, it studies Linked Data technology as the basis for solving the data association problem. Third, addressing the deficiencies of existing data quality assessment models and combining the techniques surveyed above with practical needs, it proposes a new method for constructing a data quality management meta-ontology, with reference to the seven-step method proposed by Stanford University, and constructs a general, domain-independent data quality management meta-ontology. In constructing the meta-ontology, the CDQ (Comprehensive methodology for Data Quality management) framework is introduced to provide the overall theoretical framework. At the semantic level, ontology technology is introduced to standardize concepts related to data quality and to solve the problem of describing complex and instance-layer data quality constraint rules. To solve the problem of associating data, Linked Data technology is introduced to describe resources using URIs. Then, on the basis of the constructed meta-ontology, a reasoning approach for data quality management based on SWRL rules is proposed to infer implicit data quality knowledge. Finally, against the background of oilfield development, a prototype data quality management system is designed and implemented. The system supports data quality management, meta-ontology parsing and maintenance, maintenance of constraint rules and cleaning rules, SWRL rule reasoning, and reasoning-based quality evaluation and data cleaning, which verifies the effectiveness and technical feasibility of CDQ-based data quality research for oilfield development. |