
Spatialization Of Population Data Based On Multi-Source Geographic Information

Posted on: 2020-10-23, Degree: Master, Type: Thesis
Country: China, Candidate: F L Cheng
GTID: 2417330590957533, Subject: Human Geography
Abstract/Summary:
Human beings are the principal actors in the natural geographical environment and in social and economic activities. The spatial distribution of population is always shaped by natural, socioeconomic and historical factors. Traditional population statistics take the administrative unit as the statistical unit and assume that population is evenly distributed within each unit, ignoring the fact that population distribution results from a combination of natural, social and economic conditions. Traditional demographic data also have low spatial and temporal resolution and are difficult to match with natural geospatial units in practice, which hinders their fusion with multi-source data on nature, environment and ecology. High-precision information on the spatial distribution of population is therefore needed, and the spatialization of population data is one effective way to obtain it.

This paper selects Yuexiu District, Liwan District, Tianhe District, Haizhu District, Baiyun District and Huangpu District (excluding the former Luogang District) of Guangzhou as the study area and takes population distribution as the research object. With ArcGIS 10.2, Python and SPSS 22.0 as the main tools, the following data are collected for the study area: population data, administrative division data, NPP/VIIRS nighttime light data, road network data, housing price data, POI data, building area, land use data, a digital elevation model and other data. First, the factors influencing the spatial distribution of population are identified with the GeoDetector, and irrelevant factors are eliminated. Then the population data are spatialized to a 150 m grid, both with a single population spatialization model and with partition modeling. The main conclusions are as follows:

(1) In screening the factors that influence the spatial distribution of population, this paper uses the GeoDetector for both factor detection and interaction detection. Except for the grassland index and the water area index, all factors pass the significance test at the 0.05 level, indicating that within the study area the influence of grassland and water area on the spatial distribution of population is almost negligible. The explanatory power (q value) of government agencies and social organizations is the largest, followed by public facilities, and that of the other construction land index is the smallest. The interaction detection results show that the interaction between natural factors is far weaker than the influence of socioeconomic factors, and that the interaction of natural factors combined with socioeconomic factors is also greater than the influence of natural factors alone. Most factor pairs show bi-factor enhancement and only a few show nonlinear enhancement; no pair is independent or nonlinearly weakened.

(2) Using land use data and nighttime light data, a population spatialization model is built on general multiple regression to spatialize the population data of the study area to a 150 m grid. For most streets, the relative error between the simulated population and the actual statistical population exceeds 100%. The correlation coefficient R is 0.06 and the goodness of fit R² is only 0.0039. As the scatter plot (5-2) shows, the simulated values for most streets (towns) deviate greatly from the actual population. This method therefore cannot meet the requirements of this study and is not suitable for simulating the spatial distribution of population in the study area.

(3) For the random forest model, the influencing factors are reselected: the construction land indexes (urban land, rural land and other construction land) are removed and replaced with the more precise housing area. A random forest model is then built to simulate the spatial distribution of population in the study area. The correlation coefficient between the simulated and actual population is 0.774, a considerably stronger correlation, and the average relative error is 30%. The absolute relative error exceeds 50% for 33 streets (towns), which is the main reason for the high overall error of the model. Compared with the general multiple linear regression, the random forest model is significantly more accurate and closer to the actual population distribution.

(4) Following the idea of partition modeling, the study area is divided into densely populated and non-densely populated areas using the population agglomeration method together with the needs of this study. The population data of each partition are spatialized with stepwise regression analysis and with the random forest model, and the best result for each partition is then combined. At the street (town) scale, correlation analysis, regression analysis (goodness of fit R²) and error analysis are carried out on the partition modeling results, the results of the single random forest model and the actual statistics. The correlation coefficient between the partition modeling results and the actual population of streets (towns) is 0.834, and the goodness of fit R² is 0.695; both are better than those of the single random forest model. In the error analysis, 51.89% of streets (towns) have a relative error within (-30%, 30%) under partition modeling, a higher share than under the single random forest model. Only 18 streets (towns), or 16.98% of the total, have an absolute relative error above 50% under partition modeling, significantly fewer than the 31.13% under the single random forest model.

(5) This paper spatializes the population data of the study area in two ways, with a single mathematical model and with partition modeling. The results show that partition modeling, combined with appropriate influencing factors and an appropriate population spatialization model, improves the spatial accuracy of population data and brings the simulated distribution closer to the actual situation. This result is consistent with the measures proposed by Bai Zhongqiang, Dong Nan et al. to improve the spatial accuracy of population data.
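The factor and interaction detection in conclusion (1) can be illustrated with a minimal sketch. It assumes the commonly used geographical detector definition q = 1 - (sum over strata of N_h * var_h) / (N * var) and the usual five interaction categories; the abstract does not give the formula, so the function names `q_statistic` and `interaction_type` and the sample values are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict
from math import isclose
from statistics import pvariance


def q_statistic(values, strata):
    """Explanatory power q of one stratified factor (assumed standard definition)."""
    if len(values) != len(strata):
        raise ValueError("values and strata must have the same length")
    total_var = pvariance(values)
    if total_var == 0:
        raise ValueError("q is undefined when the values have zero variance")
    # Group the dependent variable (e.g. population density) by stratum label.
    groups = defaultdict(list)
    for value, label in zip(values, strata):
        groups[label].append(value)
    # Within-strata sum of N_h * var_h, using population variances throughout.
    within = sum(len(g) * pvariance(g) for g in groups.values())
    return 1.0 - within / (len(values) * total_var)


def interaction_type(q1, q2, q12):
    """Classify an interaction from q(X1), q(X2) and q of the overlaid strata."""
    if isclose(q12, q1 + q2):
        return "independent"
    if q12 > q1 + q2:
        return "nonlinear enhancement"
    if q12 > max(q1, q2):
        return "bi-factor enhancement"
    if q12 > min(q1, q2):
        return "single-factor nonlinear weakening"
    return "nonlinear weakening"


if __name__ == "__main__":
    # Hypothetical grid-cell densities and two hypothetical classified factors.
    density = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    land_use = ["a", "a", "a", "b", "b", "b"]
    poi_class = ["lo", "lo", "hi", "lo", "hi", "hi"]
    q1 = q_statistic(density, land_use)
    q2 = q_statistic(density, poi_class)
    # The interaction uses the overlay of both factors as the stratification.
    q12 = q_statistic(density, list(zip(land_use, poi_class)))
    print(q1, q2, q12, interaction_type(q1, q2, q12))
```

Continuous factors such as elevation or nighttime light intensity would need to be classified into strata before being passed to `q_statistic`; how the thesis discretizes them is not stated in the abstract.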
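The street-level accuracy measures reported in conclusions (2) to (4) can be sketched as follows. This is an illustration under stated assumptions: relative error is taken as (simulated - actual) / actual x 100, and R² is taken as the square of the Pearson correlation, as in a simple linear fit of simulated against actual population. The abstract does not spell out either definition, and the function names and sample numbers are hypothetical.

```python
from math import sqrt


def pearson_r(simulated, actual):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(simulated)
    mean_s = sum(simulated) / n
    mean_a = sum(actual) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in zip(simulated, actual))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / sqrt(var_s * var_a)


def relative_errors(simulated, actual):
    """Signed relative error per street (town), in percent (assumed definition)."""
    return [(s - a) / a * 100.0 for s, a in zip(simulated, actual)]


def accuracy_summary(simulated, actual):
    """Correlation, R^2 and error shares of the kind quoted in the abstract."""
    errors = relative_errors(simulated, actual)
    n = len(errors)
    r = pearson_r(simulated, actual)
    return {
        "r": r,
        "r2": r * r,  # assumption: R^2 of a simple linear fit equals r squared
        "mean_abs_rel_err": sum(abs(e) for e in errors) / n,
        "share_within_30": 100.0 * sum(1 for e in errors if -30 < e < 30) / n,
        "share_above_50": 100.0 * sum(1 for e in errors if abs(e) > 50) / n,
    }


if __name__ == "__main__":
    # Hypothetical street populations, not data from the thesis.
    actual = [100.0, 200.0, 400.0, 800.0]
    simulated = [110.0, 150.0, 700.0, 820.0]
    print(accuracy_summary(simulated, actual))
```

Under this reading, 18 streets (towns) at 16.98% implies roughly 106 streets (towns) in total; that total is an inference from the reported percentages, not a figure stated in the abstract.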
Keywords/Search Tags: Multiple sources of geographic information, Partition modeling, Random Forest model, Stepwise regression method, Spatialization of population data