COD Data Analysis Of Wastewater Based On Infrared Spectroscopy

Posted on: 2020-04-29 | Degree: Master | Type: Thesis
Country: China | Candidate: H Zhang
GTID: 2370330572970529 | Subject: Mechanization of agriculture
Abstract/Summary:
Chemical oxygen demand (COD) is a chemical measure of the amount of reducing substances in a water sample that need to be oxidized. It is an important indicator of the degree of water pollution. The traditional method for measuring COD is cumbersome, inconvenient to operate, and inefficient. The samples studied in this thesis are several groups of wastewater from the cane sugar industry. Infrared spectroscopy is used to measure the infrared absorption of the wastewater at different wavelength points, with the aim of establishing a correspondence between the infrared spectrum data and the COD of the wastewater, so as to reduce experimental work, reduce chemical pollution, and improve efficiency.

The research is divided into four parts. The first part models the correspondence between spectral data and COD with a univariate linear model. The second part models it with a multiple linear model. The third part combines multiple linear models with feature extraction. The fourth part combines multivariate nonlinear models with feature extraction. The conclusion is that the multivariate nonlinear model combined with feature extraction gives the most reliable correspondence, and it is the final approach adopted in this thesis.

In the first part, a regression model is fitted between the COD of the water samples and each of the 2022 wavelength points, and the error is calculated for each. The best of the 2022 wavelength points is the 1985th. The data at that wavelength point are then used for linear regression prediction with four-fold cross-validation. Finally, the predicted values are compared with the actual values and the mean absolute error is calculated. The conclusion is that with a single variable and few samples the error is large, which is not enough to establish a correspondence.

In the second part, the full data set is first fed into a Lasso model to explore the relationship under a multiple linear model. The model is then trained, and four-fold cross-validation and prediction are performed again, with the magnitude of the error as the evaluation criterion. The model hyperparameters are also optimized. The prediction errors were still not satisfactory.

In the third part, features are first extracted from the original high-dimensional data by principal component analysis (PCA) and by truncated singular value decomposition. The number of features is chosen according to the explained variance ratio. The extracted data set is split for four-fold cross-validation and used to train a multiple linear regression model, and the validation data are fed into the trained model for prediction. The prediction results of this method are relatively good.

In the fourth part, the extracted data are split so that 80% are used for training and 20% are held out as the test set. The 80% portion is divided into four parts, and in turn each part serves as the validation set while the other three serve as the training set. The data are fed into three nonlinear models: support vector regression (SVR), SVR with a polynomial kernel (SVR-poly), and a multi-layer perceptron (MLP) neural network. Each model yields four sets of relative error and mean absolute error, which are averaged, and the best model is selected by comparison. The held-out 20% is then fed into the best nonlinear model for the final prediction. The best result is obtained with features extracted by truncated singular value decomposition, reduced to ten dimensions, and predicted with the SVR-poly model: the relative error is 0.2043 and the mean absolute error is 62.61. This achieves the aim of establishing a correspondence between COD and infrared spectrum data. Lastly, the parameters of the optimal SVR-poly model are tuned to improve accuracy. The tuned parameters are degree = 1, C = 1000, and coef0 = 19. After tuning, the mean absolute error is 60.5 and the relative error is 0.1920.

Through these four parts, the correspondence between COD and the infrared spectrum data of the wastewater is finally established, so that the COD value can be obtained simply by measuring the infrared absorption of the wastewater. The goal is to build a monitoring system for wastewater that tracks water pollution in real time. At the same time, the COD value can be obtained in place of the traditional chemical measurement, which simplifies the experimental process and is of great significance to environmental protection.
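
The wavelength screening described in the first part can be sketched roughly as follows. This is only an illustrative sketch and not the thesis's code: the data are synthetic, the function names (`fit_line`, `kfold_indices`, `cv_errors`, `best_wavelength`) are hypothetical, and the folds are assumed to be contiguous and unshuffled. It fits a one-variable least-squares line per wavelength point, scores each with four-fold cross-validation, and keeps the point with the lowest mean absolute error.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept for a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:  # constant column: fall back to predicting the mean
        return 0.0, my
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def kfold_indices(n, k=4):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cv_errors(x, y, k=4):
    """Mean absolute and mean relative error over all held-out samples."""
    abs_errs, rel_errs = [], []
    for test in kfold_indices(len(x), k):
        held_out = set(test)
        train = [i for i in range(len(x)) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train],
                                    [y[i] for i in train])
        for i in test:
            err = abs(slope * x[i] + intercept - y[i])
            abs_errs.append(err)
            rel_errs.append(err / abs(y[i]))
    return mean(abs_errs), mean(rel_errs)


def best_wavelength(spectra, cod, k=4):
    """Index of the wavelength column with the lowest cross-validated MAE."""
    n_points = len(spectra[0])
    scores = [cv_errors([row[j] for row in spectra], cod, k)[0]
              for j in range(n_points)]
    return min(range(n_points), key=scores.__getitem__), scores


# Synthetic stand-in data (assumption): 24 samples, 6 wavelength points,
# of which only column 3 actually depends on COD.
random.seed(0)
n_samples, n_points, informative = 24, 6, 3
cod = [random.uniform(100, 600) for _ in range(n_samples)]
spectra = []
for c in cod:
    row = [random.uniform(0, 1) for _ in range(n_points)]
    row[informative] = 0.002 * c + random.gauss(0, 0.01)
    spectra.append(row)

best, scores = best_wavelength(spectra, cod)
mae, rel = cv_errors([row[best] for row in spectra], cod)
print(f"best wavelength index: {best}, MAE: {mae:.2f}, relative error: {rel:.4f}")
```

With real spectra, `spectra` would hold one row per water sample and one column per wavelength point (2022 in the thesis), and `cod` the chemically measured COD values. The later parts swap the single-variable line for Lasso, multiple linear regression, or SVR on extracted features while keeping the same cross-validation and error measures.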
Keywords/Search Tags: chemical oxygen demand, infrared spectroscopy, machine learning, feature extraction, regression