Design And Implementation Of Data Analysis System For Precursor Chemicals Based On Spark

Posted on: 2020-04-24  Degree: Master  Type: Thesis
Country: China  Candidate: H S Li
GTID: 2381330575966742  Subject: Computer technology
Abstract/Summary:
Big data technologies and resources offer effective means of improving management efficiency, helping supervision departments collect and use trading data on precursor chemicals to meet the demands of dynamic market supervision. The Information System on Supervision and Management of Precursor Chemicals produces a large number of transaction application records every day. Traditional centralized data analysis methods are not suited to real-time analysis: they yield limited analytical conclusions and suffer from high latency. This thesis addresses two problems in the industry: real-time processing of precursor chemicals data and price prediction for precursor chemicals. The author designs and implements a data analysis system based on Spark. The system analyzes the trading situation of precursor chemicals in real time from live data and can also predict the prices of precursor chemicals in the short term. Adopting popular big data technologies such as Kafka and Flume reduces the complexity of system development and maintenance and lays a foundation for further business development. The main tasks of this thesis are as follows: (1) Design and implementation of the Data Analysis System for Precursor Chemicals Based on Spark, which mainly consists of an ETL module, a data processing module and a visualization module. The system collects the trading data of precursor chemicals from multiple provinces and cities in real time and performs data cleaning, statistics and storage, making it convenient for users to query the real-time trading situation of precursor chemicals. (2) To address short-term peak load in the precursor chemicals business, an optimization strategy that adjusts the batch interval of the data stream is proposed, combined with the back pressure model, to solve the problem of high system delay. (3) To meet the actual demand of companies and regulators for short-term price forecasts of precursor chemicals, three common regression analysis algorithms are implemented with the Spark ML machine learning library. Their experimental results are compared using MSE as the evaluation standard, and the thesis finally chooses the isotonic regression algorithm as the basic method for price prediction.
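To make the evaluation in task (3) concrete, the following is a minimal sketch of isotonic regression scored by MSE. It is not the thesis's Spark ML implementation: it is a plain-Python version of the standard pool-adjacent-violators procedure, and it assumes equally weighted observations already sorted by the single feature (for example, time). The function names `isotonic_fit` and `mse` and the sample prices are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def isotonic_fit(y: Sequence[float]) -> List[float]:
    """Non-decreasing least-squares fit via pool-adjacent-violators.

    Assumes y is ordered by the feature and all points have equal weight.
    """
    # Each block holds [sum of values, number of points]; its fit is the mean.
    blocks: List[List[float]] = []
    for value in y:
        blocks.append([float(value), 1.0])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand every block back to one fitted value per original point.
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up illustrative prices, not data from the thesis.
prices = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
fitted_prices = isotonic_fit(prices)
training_mse = mse(prices, fitted_prices)
```

Comparing candidate regression models then amounts to computing `mse` for each model's predictions on the same held-out data and preferring the smallest value.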
Keywords/Search Tags:Real-time processing, Batch interval optimization, Regression analysis, Price prediction