Design And Implementation Of Multidimensional Heterogeneous Data Source Management System In Data Sharing Platform

Posted on: 2024-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: Z A Li
GTID: 2568307079971229
Subject: Engineering
Abstract/Summary:
In the era of big data, enterprises and organizations have accumulated large amounts of data, and this data holds enormous value. However, that value can only be realized through circulation and sharing. To achieve open data sharing while resolving the conflicts among privacy protection, shared development, and security, researchers have proposed distributed data sharing platforms. These platforms aim to enable the circulation and monetization of data while keeping data secure and under control and preventing leakage.

The main work of this thesis is the design and implementation of the data source management functionality of a distributed data sharing platform based on a cloud-edge collaborative architecture. This includes an ETL subsystem that implements data cleansing, integration, and migration at the data node end, and a data source management subsystem, built on the cloud-edge collaborative network architecture, that provides data source access, metadata retrieval, and restricted data access management. The specific research work is as follows:

(1) This thesis studies the design and implementation of an ETL system that supports real-time data capture, high reliability, and scalability. A task execution environment is built from open-source components such as Debezium, Kafka, and Kettle, and the interaction mechanism between the ETL system management end and the task execution environment is studied. A support layer and a task execution component are designed to control ETL task execution, thereby integrating the capabilities of the open-source components into the ETL subsystem. A functional layer is designed to manage the descriptive information of ETL tasks.

(2) This thesis studies the implementation mechanism of data source management in cloud-edge collaboration scenarios. The network architecture of the data source management subsystem based on cloud-edge collaboration is designed. The adapter pattern and a dynamic plugin loading mechanism are used to adapt multidimensional, heterogeneous data sources. The thesis also studies the technical implementation of full life cycle management of data sources and of data access control.

Finally, the system's functionality, security, and performance are tested, and the test results show that the system largely meets the data source management requirements of the platform.
Keywords/Search Tags: Data Sharing, ETL, Cloud-edge Collaboration, Data Source Management, Access Control
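
The abstract names the adapter pattern and a dynamic plugin loading mechanism as the way heterogeneous data sources are adapted, but gives no detail. The following is a minimal Python sketch of that general idea, not the thesis's implementation. All names here (`DataSourceAdapter`, `ADAPTERS`, `register`, `open_source`, `load_plugin`) are hypothetical, and the two adapters are stand-ins chosen only so the example runs with the standard library.

```python
import importlib
import sqlite3
from abc import ABC, abstractmethod

# Hypothetical registry: source-type name -> adapter class.
ADAPTERS = {}


def register(source_type):
    """Class decorator that records an adapter under a source-type name."""
    def wrap(cls):
        ADAPTERS[source_type] = cls
        return cls
    return wrap


class DataSourceAdapter(ABC):
    """Uniform metadata interface, whatever the backing store is."""

    @abstractmethod
    def list_tables(self):
        """Return the table names exposed by this source."""

    @abstractmethod
    def columns(self, table):
        """Return the column names of one table."""


@register("sqlite")
class SQLiteAdapter(DataSourceAdapter):
    def __init__(self, path):
        self.conn = sqlite3.connect(path)

    def list_tables(self):
        rows = self.conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )
        return [r[0] for r in rows]

    def columns(self, table):
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        rows = self.conn.execute(f'PRAGMA table_info("{table}")')
        return [r[1] for r in rows]


@register("memory")
class InMemoryAdapter(DataSourceAdapter):
    def __init__(self, tables):
        self.tables = tables  # dict: table name -> list of row dicts

    def list_tables(self):
        return sorted(self.tables)

    def columns(self, table):
        rows = self.tables[table]
        return list(rows[0].keys()) if rows else []


def load_plugin(module_name):
    """Import a plugin module by name at runtime.

    Assumption: a plugin module applies @register to its adapter classes,
    so importing it is enough to make them available in ADAPTERS.
    """
    return importlib.import_module(module_name)


def open_source(source_type, *args, **kwargs):
    """Instantiate the adapter registered for source_type."""
    try:
        cls = ADAPTERS[source_type]
    except KeyError:
        raise ValueError(f"no adapter registered for {source_type!r}") from None
    return cls(*args, **kwargs)
```

In this sketch, calling code depends only on `DataSourceAdapter`, so supporting a new kind of source means adding one registered class, possibly in a separate module imported through `load_plugin`, without changing the callers.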