| In recent years, with the continuing acceleration of industrialization and urbanization in our country, more and more industrial and municipal wastewater has been discharged into rivers and lakes, and frequent water pollution incidents have seriously restricted the healthy development of our society and economy. To track real-time changes in surface water quality, information technology has gradually been introduced into the field of water environment. However, most current water environment monitoring systems can only obtain and display water quality information for the main sections; their management functions are seriously lacking, and pollution source information and water conservancy information are not connected to the monitoring systems. As a result, related data resources cannot be shared, refined management of the water environment cannot be implemented, and the potential value of water environment big data cannot be fully explored and utilized. To address this situation, this paper designs and implements an integrated management and big data analysis platform for the surface water environment. The platform uniformly collects, processes, and stores large-scale, heterogeneous, multi-source water-environment-related information, and analyzes, forecasts, and utilizes water environment data based on big data technology. It can help managers improve their work efficiency and provide them with scientific and reasonable decision support. First, this paper carries out a specific requirements analysis and architecture design for the platform. The platform is divided into three parts: a data collection and aggregation system, a big data analysis system for water quality, and an integrated management system for the surface water environment. These parts are responsible, respectively, for data acquisition and integration, analysis and prediction, and visual display and comprehensive management. In the data collection and aggregation part, this paper designs a unified monitoring data
format for the water environment and uses scheduled tasks with different execution frequencies and logic to perform scalable collection of multi-source water quality data. The stream-processing technologies Kafka and Spark Streaming are used to process heterogeneous raw monitoring data efficiently. To store massive data and improve the query and analysis efficiency of both real-time and historical data, the Redis real-time database stores the latest monitoring data, the Elasticsearch database stores historical monitoring data, and HDFS is used for regular backups. In the water quality big data analysis part, this paper designs a combined water quality prediction model based on the traditional time series prediction algorithm Prophet and the deep learning neural network prediction method LSTM. The Prophet model captures the overall trend of the water quality time series, and the LSTM neural network model fits the remaining nonlinear error data. Compared with the true values, the mean relative error (MRE) of the prediction results is 18.98%, a level at which the model can be put into practical application. Finally, this paper implements the three parts of the platform and tests its functions and performance in a distributed environment. The test results show that the platform meets the stated requirements and maintains good performance during concurrent data collection, data query, and big data analysis. |
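
The combined prediction scheme described above (a trend model plus a second model fitted to the remaining error) and the MRE metric can be illustrated with a minimal sketch. It does not call Prophet or an LSTM: the trend and residual predictions are supplied as plain lists, and all function names and sample values are hypothetical. The MRE definition used here, the mean of |predicted - actual| / |actual|, is the common one and is assumed rather than taken from the paper.

```python
from typing import List, Sequence


def residuals(actual: Sequence[float], trend_fit: Sequence[float]) -> List[float]:
    """Error series left over after the trend model (the series a second
    model such as an LSTM would be trained on)."""
    if len(actual) != len(trend_fit):
        raise ValueError("series must have the same length")
    return [a - t for a, t in zip(actual, trend_fit)]


def combine_forecasts(trend_forecast: Sequence[float],
                      residual_forecast: Sequence[float]) -> List[float]:
    """Final forecast = trend forecast + predicted residual."""
    if len(trend_forecast) != len(residual_forecast):
        raise ValueError("series must have the same length")
    return [t + r for t, r in zip(trend_forecast, residual_forecast)]


def mean_relative_error(actual: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Assumed MRE definition: mean of |predicted - actual| / |actual|,
    returned as a fraction (multiply by 100 for a percentage)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("relative error is undefined for zero actual values")
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values, for illustration only (not data from the paper).
observed = [4.0, 5.0, 8.0]          # measured water quality indicator
trend_part = [3.0, 5.0, 7.0]        # stand-in for the trend model's output
residual_part = [0.5, 0.0, 1.0]     # stand-in for the residual model's output

combined = combine_forecasts(trend_part, residual_part)
mre = mean_relative_error(observed, combined)
print(f"combined forecast: {combined}, MRE: {mre:.2%}")
```

In a real pipeline the two stand-in lists would come from the fitted trend model and from a model trained on the output of `residuals`; the additive combination is one plausible reading of "fit the remaining non-linear error data", not a confirmed detail of the paper's implementation.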