The Research On Data Mining Of WWTP And Related Technologies

Posted on: 2008-10-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X D Li
Full Text: PDF
GTID: 1101360215479778
Subject: Environmental Engineering
Abstract/Summary:
The problem of being "data rich but information poor" is serious in urban wastewater treatment plants (WWTPs). Wastewater treatment is a complex industrial process, and its data differ from business, financial and biological data in five respects: (1) large volume, high dimensionality and strong coupling; (2) process noise and uncertainty; (3) dynamic behavior and varied data types; (4) multi-temporal and incomplete records; (5) multi-modality. Against this background, this dissertation investigates techniques in four areas: data preprocessing; nonlinear dynamic analysis and forecasting of influent time series; fault diagnosis and fault sign pattern mining; and construction and extension of the IWA COST simulation benchmark. The main contributions are as follows.

1. Data preprocessing. Data integration, cleaning, transformation and reduction are studied, and corresponding algorithms are proposed. On this basis, a subject- and application-oriented data preprocessing technique is presented and used to design three different data preprocessing procedures.

2. Nonlinear dynamic analysis and forecasting of influent time series. First, several phase space reconstruction techniques are studied and used to reconstruct the phase space: the Grassberger-Procaccia (GP) algorithm, the False Nearest Neighbors (FNN) method, the Cao method, the autocorrelation function method, the mutual information method and the C-C algorithm. Then, using the largest Lyapunov exponent, close returns plots (CRP) and surrogate data analysis, it is concluded that the WWTP influent time series is chaotic. Based on this conclusion, the time series is forecast: a neural network (NN) is trained on the results of the phase space reconstruction, yielding a well-fitted input/output NN model. The trained model is then used to forecast the WWTP influent time series, and the results indicate that reasonable forecasts are achieved.

3. Fault diagnosis and fault sign pattern mining. Because the fault classes are unbalanced in data quantity or importance, a risk functional with weight coefficients based on leave-one-out errors, RWLOO, is presented, and a Genetic Algorithm (GA) is used to optimize it globally. Because the dataset is large, a simplified algorithm for computing RWLOO is presented to reduce the amount of calculation. The improved Support Vector Machine (SVM) is used to classify a WWTP dataset, and the results indicate that it achieves higher classification accuracy than the standard SVM and a neural network (NN). Some signs appear before faults occur. To distinguish these signs from normal operation quickly, so that the necessary measures can be taken and the fault avoided, fault sign pattern mining is proposed. First, a serial pattern dissimilarity measure is used to distinguish fault signs from normal patterns. Then, based on this dissimilarity measure, a sliding-window fault sign pattern mining algorithm is proposed. Practical application of the algorithm indicates that fault sign patterns can be identified in advance.

4. Construction and extension of the simulation benchmark. First, the simulation benchmark proposed by the International Water Association (IWA) and European Co-operation in the field of Scientific and Technical Research (COST) is studied. The IWA COST simulation benchmark is then extended to simulate faults of the activated sludge treatment process. The highlight of the extension is an activated sludge bulking model: kinetic selection theory, bacterial decay theory, nutrient diffusion theory and storage theory are incorporated in the reactor part, and the double-exponential settling model is improved according to filamentous backbone theory. The extended model can simulate bulking caused by influent, aeration rate, organic substrates and N-NH concentration, and the simulation results conform to theoretical analysis. Further extensions cover faults caused by sensors and actuators. The extended model can simulate the major faults of the activated sludge treatment process, which shows good prospects for application in process modeling, the construction and evaluation of control strategies, staff training, trend prediction and environmental risk assessment.
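
To illustrate the phase space reconstruction step in contribution 2, the following is a minimal sketch of time-delay embedding in Python with NumPy. It is not the dissertation's implementation. The function names (autocorrelation_delay, delay_embed, make_training_pairs) are hypothetical, the 1/e autocorrelation threshold is one common convention for choosing the delay and is assumed here, and the embedding dimension is taken as given (in practice it could come from a method such as FNN or Cao).

```python
import numpy as np

def autocorrelation_delay(x, max_lag=200):
    """Pick a delay: first lag where the autocorrelation falls below 1/e.

    The 1/e threshold is an assumed convention, not taken from the source.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    for lag in range(1, min(max_lag, len(x) - 1) + 1):
        acf = np.dot(x[:-lag], x[lag:]) / var
        if acf < 1.0 / np.e:
            return lag
    return max_lag  # fallback if the threshold is never crossed

def delay_embed(x, dim, tau):
    """Build delay vectors [x[j], x[j+tau], ..., x[j+(dim-1)*tau]] as rows."""
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return np.column_stack(
        [x[i * tau : i * tau + n_vectors] for i in range(dim)]
    )

def make_training_pairs(x, dim, tau):
    """Pair each delay vector with the next observed value (one-step target)."""
    x = np.asarray(x, dtype=float)
    vectors = delay_embed(x, dim, tau)
    inputs = vectors[:-1]                  # last vector has no next value
    targets = x[(dim - 1) * tau + 1 :]     # value following each vector
    return inputs, targets

if __name__ == "__main__":
    # Synthetic stand-in for an influent series (not real plant data).
    t = np.arange(1000)
    series = np.sin(2 * np.pi * t / 100)
    tau = autocorrelation_delay(series)
    inputs, targets = make_training_pairs(series, dim=3, tau=tau)
    print(tau, inputs.shape, targets.shape)
```

The (inputs, targets) pairs produced this way are the kind of input/output data on which a one-step-ahead forecasting model, such as a neural network, could be trained.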
Keywords/Search Tags: Urban wastewater treatment plant, Data mining, Nonlinear dynamic analysis, Forecast of influent time series, Fault diagnosis, Fault sign pattern mining, Extension of the COST simulation benchmark