Multi-functional Data Clustering And Its Application On Air Pollution

Posted on: 2019-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Zhai
GTID: 2381330611472438
Subject: Applied statistics
Abstract/Summary:
With the rapid development of science and technology and the continuous improvement of measurement methods, our ability to collect data has made a qualitative leap. The data obtained are more densely sampled, and their volume keeps growing. The concept of functional data was introduced to analyze such data better, and functional data analysis methods have since attracted more and more research attention. In this paper, k-centres surface clustering methods based on marginal functional principal component analysis are proposed for bivariate functional data. Because multivariate functional data are complex, the distance between multivariate functions alone cannot reflect differences in function shape. To obtain accurate clustering, a novel clustering criterion is proposed that considers both the random surface and its partial derivative functions in the two directions.

The proposed multivariate functional data clustering method is then applied to air pollution data. First, with the help of R, the collected AQI data are used to explore the air pollution status in Liuzhou in recent years, through the changes in PM2.5 concentration and the distribution of air quality levels; combined with the collected wind speed data, the impact of wind on air quality is further explored. Two clustering methods are then applied. The first uses MATLAB to interpolate the PM2.5 concentration data into functional data, applies functional principal component analysis to the functionalized data to obtain principal component scores, and performs k-means clustering on those scores. The second builds on the first: it forms new functional data by stringing the principal component scores together, performs functional principal component analysis again to obtain new principal component scores, and then performs k-means clustering on the new scores. This paper reports the results of both methods for 4 and 5 clusters and analyzes each result in light of the actual pollution situation in Liuzhou.

The results show that air quality in Liuzhou in recent years has been mainly good, while the number of days with light to moderate pollution is increasing. Air pollution was most serious in 2014, when severe pollution occurred many times. Wind also has a significant effect on air pollution. The results of the two clustering methods show that the clustering method proposed in this paper can effectively distinguish different types of pollution conditions. With 5 clusters the division of air conditions is more refined, and the clustering results are consistent with the real situation. Finally, in light of the actual pollution situation in Liuzhou, this paper proposes suggestions for addressing air pollution through strengthened air monitoring, rational planning of the urban layout, and increased supervision and law enforcement, and it looks ahead to future directions for multivariate functional data clustering analysis.
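The first clustering method described in the abstract (interpolate raw readings into curves, compute functional principal component scores, run k-means on the scores) can be illustrated with a minimal Python sketch. This is not the thesis's MATLAB or R code. It assumes linear interpolation onto a common grid, approximates functional PCA by ordinary PCA of the discretised curves, and uses synthetic data in place of the Liuzhou PM2.5 records. All function names and the synthetic curves are hypothetical.

```python
import numpy as np

def interpolate_to_grid(times, values, grid):
    """Linearly interpolate irregular observations onto a common grid."""
    return np.interp(grid, times, values)

def fpca_scores(curves, n_components):
    """Discretised FPCA (an approximation): PCA of centred curves on a grid."""
    centred = curves - curves.mean(axis=0)
    # Rows of vt approximate the eigenfunctions; u * s gives the scores.
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; initial centres are k distinct data points."""
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centres = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres

# Synthetic stand-in data (assumption): 10 "clean" and 10 "polluted" curves,
# each observed at 30 irregular time points with noise.
rng = np.random.default_rng(42)
grid = np.linspace(0.0, 1.0, 50)
curves = []
for i in range(20):
    t = np.sort(rng.uniform(0.0, 1.0, size=30))
    level, amplitude = (30.0, 5.0) if i < 10 else (120.0, 20.0)
    y = level + amplitude * np.sin(2 * np.pi * t) + rng.normal(0.0, 3.0, size=30)
    curves.append(interpolate_to_grid(t, y, grid))
curves = np.vstack(curves)

scores = fpca_scores(curves, n_components=2)   # shape (20, 2)
labels, centres = kmeans(scores, k=2)
print(labels)
```

Under the same assumptions, the second method could be sketched by arranging the resulting scores into new curves and calling `fpca_scores` and `kmeans` again. The abstract does not specify how the scores are strung together, so that step is left out here.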
Keywords/Search Tags:Multivariate functional data, Cluster analysis, Air pollution, Principal component analysis