
Design And Implementation Of A Real-time Data Warehouse For Industry

Posted on: 2023-11-15
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Lv
GTID: 2568306815991399
Subject: Computer technology
Abstract/Summary:
The big data era has brought both opportunities and challenges to every sector of society, and the introduction of the Industry 4.0 concept and the Made in China 2025 plan has further accelerated the combination of big data technology with industrial manufacturing. How to extract value from industrial data through big data technology, so as to support decision-making, optimise production processes and realise intelligent manufacturing, has therefore become a key question in industrial big data research. However, the 4V characteristics of big data have become significantly more pronounced in the industrial sector as manufacturing plants grow in scale and the number of production devices increases. Traditionally, offline data warehouses are used to store and process massive amounts of data, but they cannot provide an effective solution for more time-sensitive needs, such as real-time calculation of monitoring indicators and real-time monitoring and early warning for factory equipment. Such problems call for a new processing method.

Based on the processing of industrial real-time data, this paper designs a Flink-based, industry-oriented real-time data warehouse system consisting of four modules: data integration, data processing, data storage and data services. The data integration module uses different collection methods for business data and stream data and transfers both to the real-time data warehouse via Kafka. The data processing module implements dimensional modelling of real-time data and is divided into an ODS layer, a DWD layer, a DWS layer, an ADS layer and a DIM layer. The data in each layer is computed by Flink and forwarded through Kafka, while real-time early warning for industrial equipment and environmental conditions is implemented with complex event processing (CEP). The data storage module stores data by category: ClickHouse stores the DWS layer data, HBase and Redis store the dimensional data, and Kafka stores the intermediate layer data. The data service module provides the interaction between users and the industrial real-time data warehouse.

In the industrial field, the complexity and dynamics of the data often make it difficult to formulate CEP rules for real-time early warning by hand. To solve this problem, this paper designs an automatic CEP rule extraction framework based on a genetic algorithm, which extracts the CEP rules required for real-time early warning from historical data, and verifies the accuracy and feasibility of the algorithm through experiments.

Finally, this paper builds the system environment on a distributed cluster to implement the specific functions of each module of the real-time data warehouse, and develops a visualization interface to display the warehouse data. The resulting system processes and analyses industrial real-time data and provides statistical analysis, equipment monitoring and early warning functions for industrial production.
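
The idea of extracting an early-warning rule from historical data with a genetic algorithm can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the rule encoding, the fitness function or the genetic operators, so the rule form `(threshold, k)` ("alarm after `k` consecutive readings above `threshold`"), the function names `detect`, `fitness` and `evolve_rule`, and the sample readings are all hypothetical. A real deployment would express the resulting rule as a Flink CEP pattern rather than in plain Python.

```python
import random
from typing import List, Tuple

# Hypothetical rule encoding: alarm after k consecutive readings above threshold.
Rule = Tuple[float, int]
History = List[Tuple[List[float], bool]]  # (window of readings, "was a fault" label)


def detect(readings: List[float], rule: Rule) -> List[int]:
    """Return the indices at which the rule fires (once per qualifying run)."""
    threshold, k = rule
    alarms, run = [], 0
    for i, value in enumerate(readings):
        run = run + 1 if value > threshold else 0
        if run == k:  # fire when the run first reaches length k
            alarms.append(i)
    return alarms


def fitness(rule: Rule, history: History) -> float:
    """Fraction of labelled historical windows the rule classifies correctly."""
    correct = sum((len(detect(window, rule)) > 0) == label for window, label in history)
    return correct / len(history)


def evolve_rule(history: History, generations: int = 30,
                pop_size: int = 20, seed: int = 0) -> Rule:
    """Search for a (threshold, k) rule with a small elitist genetic algorithm."""
    rng = random.Random(seed)
    values = [v for window, _ in history for v in window]
    lo, hi = min(values), max(values)
    max_k = min(len(window) for window, _ in history)
    pop = [(rng.uniform(lo, hi), rng.randint(1, max_k)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda r: fitness(r, history), reverse=True)
        elite = pop[: pop_size // 2]  # selection: keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            threshold, k = a[0], b[1]  # crossover: threshold from a, k from b
            if rng.random() < 0.3:  # mutation: nudge the threshold
                threshold = min(hi, max(lo, threshold + rng.gauss(0, (hi - lo) * 0.1)))
            if rng.random() < 0.3:  # mutation: lengthen or shorten the run
                k = min(max_k, max(1, k + rng.choice([-1, 1])))
            children.append((threshold, k))
        pop = elite + children
    return max(pop, key=lambda r: fitness(r, history))


# Made-up temperature windows, labelled by whether a fault followed.
history = [
    ([70, 72, 71, 73], False),
    ([70, 85, 86, 87], True),
    ([84, 70, 85, 70], False),
    ([86, 87, 88, 70], True),
]
best_rule = evolve_rule(history)
```

For example, `detect([70, 81, 82, 83, 84, 60, 85, 86], (80.0, 3))` returns `[3]`: the alarm fires at the third consecutive reading above 80 and does not fire again for the short run at the end. Keeping the better half of the population unchanged in each generation means the best fitness found never decreases, which is one simple design choice among many possible ones.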
Keywords/Search Tags: Industrial big data, Real-time data processing, Data warehouse, Complex event processing