
Research On Data Integration Technology Of Oilfield Regional Data Lake

Posted on: 2022-04-19   Degree: Master   Type: Thesis
Country: China   Candidate: N Hou   Full Text: PDF
GTID: 2481306329952989   Subject: Master of Engineering
Abstract/Summary:
The biggest difficulty in building a regional data lake for an oil field is integrating different data sources. The regional data lake covers all oil and gas exploration and development business domains of the local oil field. Because of the requirements of the source application systems, there are various database types, which form separate data centers, hinder the interchange of local data, and raise many technical problems of heterogeneous data integration. Data integration technology is therefore urgently needed to solve these problems.

Data integration for regional data lakes still faces the following problems. First, data formats are heterogeneous: during data exchange, the same object is often described in different ways, and this heterogeneity causes data redundancy and errors that affect the mapping and exchange of data. How to resolve this heterogeneity, how to map between different data items, and how to determine their matching relationships are the focus of data integration technology. Second, the traditional approach to data mapping is to manually build a thesaurus that records the matching relationships between data items in different data models. Whenever data items are added or changed, the thesaurus must also be updated manually, which is very cumbersome, so an automatic way of matching data items across different data models is needed. Third, structured, semi-structured, and unstructured data are multi-source and heterogeneous, which makes data services such as data exchange and query inconvenient.

In response to these problems, this thesis studies the core data integration technology of the oilfield regional data lake based on the data resource catalog, starting from the two aspects of data mapping and data service. It proposes a data model semantic mapping technology based on data element keywords and a data service based on application data sets. The specific research content includes the following aspects:

1. Research on data element semantic description based on data element keywords. By analyzing the concepts related to data elements, the semantics of a data element are expressed through keywords: object words, qualifiers of object words, characteristic words, and qualifiers of characteristic words. Together these keywords give the semantic description of the data element.

2. Research on mapping technology between the data resource catalog and the regional data lake. By analyzing the structure of the data resource catalog model and the regional data lake model, the semantics of the two data models are described in terms of data elements, and the mapping relationship between the two models is established automatically using edit distance and the Jaccard similarity algorithm. This integrates the data resource catalog model with the regional data lake model.

3. Research on data services of the oilfield regional data lake. Using the mapping relationship between the application data sets of the data resource catalog and the data model of the regional data lake, data query and exchange services are implemented on top of the application data sets.

This thesis studies data mapping and data service, the core data integration technologies of the oilfield regional data lake, and has certain theoretical significance and practical value.
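
The keyword-based mapping in items 1 and 2 can be illustrated with a minimal Python sketch. The abstract names the ingredients (the four kinds of data element keywords, edit distance, and Jaccard similarity) but does not say how they are combined, so everything specific here is an assumption for illustration only: the `DataElement` class, the equal weighting of the two scores, the 0.6 threshold, and the sample field names are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """Hypothetical data element described by the four keyword kinds."""
    name: str
    object_word: str
    object_qualifier: str
    characteristic_word: str
    characteristic_qualifier: str

    def keywords(self) -> frozenset:
        # Collect the non-empty keywords, lower-cased, as a set.
        parts = (self.object_word, self.object_qualifier,
                 self.characteristic_word, self.characteristic_qualifier)
        return frozenset(p.lower() for p in parts if p)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def name_similarity(a: str, b: str) -> float:
    """Edit distance normalized to a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a.lower(), b.lower()) / longest


def jaccard(x: frozenset, y: frozenset) -> float:
    """Jaccard similarity: size of intersection over size of union."""
    union = x | y
    if not union:
        return 1.0
    return len(x & y) / len(union)


def similarity(a: DataElement, b: DataElement, name_weight: float = 0.5) -> float:
    # Assumed combination: weighted average of the two scores.
    return (name_weight * name_similarity(a.name, b.name)
            + (1.0 - name_weight) * jaccard(a.keywords(), b.keywords()))


def build_mapping(catalog, lake, threshold: float = 0.6) -> dict:
    """Map each catalog element to its best lake match above the threshold."""
    mapping = {}
    for c in catalog:
        best = max(lake, key=lambda l: similarity(c, l), default=None)
        if best is not None and similarity(c, best) >= threshold:
            mapping[c.name] = best.name
    return mapping


if __name__ == "__main__":
    catalog = [DataElement("well_depth", "well", "", "depth", "measured")]
    lake = [
        DataElement("well_dept", "well", "", "depth", "measured"),
        DataElement("oil_output", "oil", "crude", "output", "daily"),
    ]
    print(build_mapping(catalog, lake))
```

In this sketch, edit distance compares the field names and Jaccard similarity compares the keyword sets, so two fields with slightly different spellings but the same semantic keywords still match. A production mapping would need domain-specific tuning of the weight and threshold.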
Keywords/Search Tags: data integration, regional data lake, data resource catalog, data mapping