Statistical Analysis And Visualization For Multidimensional Environmental And Geochemical Data

Posted on: 2018-02-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J T Shi
GTID: 1311330512487301
Subject: Environmental Science
Abstract/Summary:
In environmental science and geochemistry, experiments and field studies produce ever larger volumes of data. These data contain many features together with relational and classification information, and they are inherently interrelated, multidimensional and nonlinear, which makes them difficult to analyze directly with traditional geochemical methods. This dissertation therefore takes multidimensional environmental and geochemical data as its research object and applies methods from machine learning and pattern recognition, including clustering algorithms and dimensionality reduction. Cluster analysis is applied to hydrogeochemical data with linear relationships, and nonlinear mapping is applied to petroleum geochemical data with nonlinear relationships. Dimensionality reduction is combined with end member mixing analysis to carry out mixed-source analysis of multidimensional geochemical data. Multidimensional visualization is used to show the relationships within the data in a low-dimensional space (2D or 3D), in order to reveal their geochemical distribution and significance. The main research results are as follows:

1. With the Likeng landfill and its surrounding area in Guangzhou as the study region, the chemical types of groundwater were classified using the Shchukarev (Shukalev) classification and the Piper trilinear diagram. To show the relationships among groundwater components and their degree of similarity more directly and in finer detail, hierarchical cluster analysis and the K-means clustering algorithm were applied to the datasets of groundwater major ions, trace elements and polycyclic aromatic hydrocarbons (PAHs). Groundwater in the study area is mainly of the HCO3-Ca, HCO3-Na+K and Cl-Na+K types. The chemical type of the water changes between the dry season and the wet season, which indicates that ion exchange and ion mixing take place. Both clustering algorithms show the relationships and degree of similarity among the samples well, and the results agree with the actual geological background and hydrological conditions. Hierarchical cluster analysis and K-means clustering can therefore cluster multidimensional geochemical data effectively, and the results are intuitive, clear and easy to understand, considerably more so than those of descriptive classification.

2. Mixed-source oil and gas are common. Identifying the composition and source of the end members in mixed-source oil is of practical importance for oil and gas exploration. The self-organizing map (SOM) and the Sammon mapping algorithm were used for nonlinear mapping analysis of 74 mixed-oil samples, each described by 18 biomarker ratios and 2 stable isotope features. The SOM classification results were visualized. Four categories were obtained: samples from the Shublik Formation source rock, samples from the Hue-GRZ source rock, samples from mixtures of the Prudhoe Bay Field and nearby pools, and samples from the Kingak Shale source rock. The classification results of SOM and Sammon mapping were compared with those of the alternating least squares method and multidimensional scaling. The comparison shows that SOM and Sammon mapping classify nonlinear oil and gas geochemical data better. In practice, the algorithm still has to be chosen to suit the requirements of the application so that reasonable conclusions and explanations can be drawn, which helps to reveal the geochemical significance of oil and gas data.

3. Principal component analysis (PCA) and end member mixing analysis (EMMA) were used for mixed-source analysis of the groundwater datasets of major elements, trace elements and organic compounds. The results were compared with qualitative results to identify the sources of groundwater pollution. PCA was used to reduce the dimensionality of the major element, trace element and PAH datasets separately, combining the original variables into new, mutually uncorrelated composite variables that retain the main features. Principal components were selected according to their variance contributions, and the number of end members was then determined by EMMA. Two pollution sources were identified: domestic and agricultural non-point-source pollution, and landfill leachate. Leachate reaches the groundwater mainly in two ways: migration underground along faults F7 and F9, which causes serious groundwater pollution near the faults, and direct leakage at the ground surface. In addition, the PAHs are also influenced by atmospheric precipitation and industrial pollution. The results are consistent with the geographical location, hydrogeology, climate and other natural conditions of the study area.

4. Independent component analysis (ICA) combined with EMMA was used to process the groundwater PAH data. The independent components were computed by maximizing non-Gaussianity, and the number and location of the end members were determined. The PAH sources obtained with ICA-EMMA are the same as those obtained with PCA-EMMA, which confirms that PAHs in the groundwater around the Likeng landfill come mainly from landfill leachate, atmospheric precipitation, urban domestic sewage and industrial wastewater. Although leachate is not discharged directly into the groundwater, it has polluted the surrounding water to varying degrees. The resulting hydrogeochemical characteristics differ with natural conditions and are consistent with the geographic location, hydrogeology and climate of the study area. The results therefore show that both methods perform well in mixing analysis of geochemical data with linear relationships.
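
The PCA step in point 3, where principal components are chosen by their variance contribution before the number of end members is fixed, can be illustrated with a minimal sketch. It is not the dissertation's implementation. The function name `end_member_count_from_pca`, the 95% cumulative-variance threshold and the synthetic three-source dataset are assumptions made here for illustration. The rule that m retained components imply m + 1 end members is a common EMMA convention, not something stated in the abstract.

```python
import numpy as np


def end_member_count_from_pca(X, var_threshold=0.95):
    """Suggest a number of end members from PCA variance contributions.

    X is an (n_samples, n_solutes) concentration matrix.
    Assumption: keep the fewest components whose cumulative explained
    variance reaches var_threshold, then use components + 1 end members.
    """
    X = np.asarray(X, dtype=float)
    # PCA on the correlation matrix is PCA on standardized solutes.
    corr = np.corrcoef(X, rowvar=False)
    # eigvalsh returns ascending eigenvalues; reverse to descending.
    eigvals = np.linalg.eigvalsh(corr)[::-1]
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negatives
    ratio = eigvals / eigvals.sum()        # variance contribution per component
    cumulative = np.cumsum(ratio)
    n_components = int(np.searchsorted(cumulative, var_threshold)) + 1
    n_components = min(n_components, ratio.size)
    return n_components, n_components + 1, ratio


# Synthetic demonstration (hypothetical values, not data from the study):
# 60 samples formed by mixing three sources described by six solutes.
rng = np.random.default_rng(0)
end_members = np.array([
    [10.0, 1.0, 5.0, 0.5, 20.0, 2.0],
    [2.0, 8.0, 1.0, 4.0, 5.0, 9.0],
    [6.0, 3.0, 9.0, 7.0, 1.0, 4.0],
])
fractions = rng.dirichlet(np.ones(3), size=60)  # mixing fractions sum to 1
samples = fractions @ end_members + rng.normal(scale=0.05, size=(60, 6))

n_pc, n_em, ratio = end_member_count_from_pca(samples)
print("retained components:", n_pc)
print("suggested end members:", n_em)
print("variance contributions:", np.round(ratio, 3))
```

Mixtures of three sources lie on a two-dimensional plane in solute space, so two components should carry almost all of the variance in this synthetic case. On real groundwater data the threshold is a judgment call, and the suggested count should be checked against the hydrogeological setting.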
Keywords/Search Tags: Multidimensional geochemical data, Dimensionality reduction, Clustering analysis, End member mixing analysis, Nonlinear mapping