
Research On The Errors In Rasterization And Their Mechanisms

Posted on: 2021-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhao
Full Text: PDF
GTID: 2370330626453909
Subject: Geological engineering
Abstract/Summary:
Vector and raster data structures are the two data structures most commonly used in Geographic Information Systems (GIS), and they are also the main ways of expressing spatial data. The vector structure is characterized by "explicit position, implicit attribute", while the raster structure is characterized by "explicit attribute, implicit position". In practical applications, data in the two structures must frequently be converted from one to the other. In recent years, with the rapid development of geospatial information acquisition technology (RS and GPS) and computer technology, raster data has increasingly become the main data source format in various industries. The conversion of vector data to raster data has therefore attracted close attention, and rasterization has become a hot topic.

Rasterization of vector data is a process that loses information, including changes in shape, position, area, and so on, which are called errors. Among them, the area change is the research focus of scholars in various fields. Three pixel coding methods are used in rasterization: the Rule of Cell Center (RCC), the Rule of Maximum Area (RMA) and the Rule of Maximum Combined Area (RMCA). These three coding methods cause different errors. However, little work has been done to compare the rasterization errors caused by the three pixel coding methods, and there is even less discussion of the mechanisms behind those differences.

In view of this, using ArcGIS software, the three pixel coding methods were applied to rasterize 1:100,000 land use vector data of the Beijing-Tianjin-Hebei region for 2018 at 13 grid cell sizes from 50 m to 6 km. The accuracy loss of the rasterized area was calculated with the Weighted Average Method and with the Error Evaluation Method Based on Grid Cells (EEM-BGC); the accuracy losses from RCC, RMA and RMCA were then compared and analyzed; and finally the error mechanisms of the three coding methods were explored through the formula derivation of the accuracy evaluation methods and the principles of the coding methods. The main analytical work and conclusions of this paper are as follows:

(1) Based on the Weighted Average Method, the errors of the different coding methods were compared and analyzed. The rasterization error evaluated by the Weighted Average Method actually reflects the area change of the data as a whole, that is, the change of the study area boundary before and after rasterization. Within the range of selected rasterization scales, the rasterization errors of RCC are the smallest (0.04% to -0.10%), and they fluctuate around 0 as the raster scale increases. The rasterization error of RMCA is the second smallest, and that of RMA is the largest; the errors of these two methods increase as the scale increases.

(2) The mechanisms of the rasterization error evaluated by the Weighted Average Method were analyzed. The fundamental reason for the different errors of the three coding methods is that they retain and lose different grid cells along the boundary line. Under RCC, the center point of a raster cell on the boundary line of the study area falls inside or outside the boundary with a probability of 1/2 each, so the number of retained raster cells is close to the number of lost raster cells and the positive and negative deviations offset each other. The errors are therefore the smallest and fluctuate around 0. Under RMCA and RMA, a grid cell on the boundary line of the study area is divided into two parts, inside and outside the boundary line. The probability that the area of a land-attribute patch inside the boundary is greater than the blank area outside the boundary is less than 1/2, so the number of retained grid cells is less than the number of lost grid cells. Moreover, as the grid size increases, fewer and fewer grid cells are retained and the negative deviation outweighs the positive one. The error therefore grows larger and larger, and its values are negative. Between RMCA and RMA, the retention probability of RMCA is greater than that of RMA because RMCA combines the areas of patches with the same attribute.

(3) The error distribution and error causes of the three coding methods were compared and analyzed with the Error Evaluation Method Based on Grid Cells (EEM-BGC). The error of RCC is the largest (9.26%-55.04%), followed by RMA (7.051%-43.92%), and the error of RMCA is the smallest (5.96%-36.84%). The mechanism is that the rasterization error evaluated by EEM-BGC is a local error, that is, the error of every grid cell: the change in the overall area of the study area before and after rasterization is derived from the change in area within each grid cell. Within a grid cell, RMCA always represents the feature with the largest area, so the proportion of the cell occupied by that feature is always the maximum and the rasterization error is the smallest, followed by RMA. The attribute selected by RCC can only be that of the feature at the center of each grid cell. The proportion of the cell occupied by that feature cannot be determined in advance, but it is certainly less than or equal to the proportions obtained by the other two methods, so the rasterization error of RCC is the largest.

(4) The rasterization errors obtained with the two error evaluation methods were compared and analyzed. The results of the two evaluation methods differ greatly. For all three coding methods, the errors given by EEM-BGC are clearly greater than those given by the Weighted Average Method, and the rasterization error evaluated by EEM-BGC varies more markedly with scale.

(5) The results of both evaluation methods show that the rasterization error changes most slowly in the scale interval of 200 m to 800 m, where the error can be predicted and controlled. It is therefore suggested that the appropriate scale for rasterizing the land use vector data of the Beijing-Tianjin-Hebei region is 200 m to 800 m.

In summary, the rasterization errors of the three pixel coding methods were calculated with the Weighted Average Method and EEM-BGC, the error distributions were compared and analyzed, and the error mechanisms were further discussed. The conclusions and research ideas of this paper offer a useful reference for future research on rasterization.
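
As an illustration of how the three coding rules discussed in the abstract can assign different values to the same grid cell, the following is a minimal Python sketch. It is not the thesis's implementation: the abstract does not define the rules in detail, so the definitions below follow the usual description of cell-center, maximum-area and maximum-combined-area assignment, and the patch representation, the function names and the misrepresented_share measure are all hypothetical. In particular, misrepresented_share is only a simple per-cell indicator and is not the EEM-BGC formula, which the abstract does not state.

```python
from collections import defaultdict

# Hypothetical representation of one grid cell: a list of patches, each a tuple
# (land_use_class, share_of_cell_area, covers_cell_center).
# land_use_class None stands for blank area outside the study-area boundary.

def rcc(patches):
    """Rule of Cell Center: take the class of the patch covering the cell center."""
    for cls, _, covers_center in patches:
        if covers_center:
            return cls
    raise ValueError("no patch covers the cell center")

def rma(patches):
    """Rule of Maximum Area: take the class of the single largest patch."""
    return max(patches, key=lambda p: p[1])[0]

def rmca(patches):
    """Rule of Maximum Combined Area: sum areas per class, take the largest total."""
    totals = defaultdict(float)
    for cls, area, _ in patches:
        totals[cls] += area
    return max(totals, key=totals.get)

def misrepresented_share(patches, assigned):
    """Illustrative per-cell indicator (assumption): share of the cell's area
    whose class differs from the class assigned to the cell."""
    return sum(area for cls, area, _ in patches if cls != assigned)

# Interior cell: two forest patches, one cropland patch, water at the center.
interior_cell = [
    ("forest", 0.30, False),
    ("forest", 0.25, False),
    ("cropland", 0.35, False),
    ("water", 0.10, True),
]

# Boundary cell: two forest patches inside the boundary, blank area at the center.
boundary_cell = [
    ("forest", 0.25, False),
    ("forest", 0.30, False),
    (None, 0.45, True),
]

for name, rule in (("RCC", rcc), ("RMA", rma), ("RMCA", rmca)):
    assigned = rule(interior_cell)
    share = misrepresented_share(interior_cell, assigned)
    print(name, "interior:", assigned, round(share, 2),
          "| boundary:", rule(boundary_cell))
```

In the made-up interior cell, RCC picks the small patch at the center, RMA picks the largest single patch, and RMCA picks the class with the largest combined area, so the misrepresented share decreases from RCC to RMA to RMCA, the same ordering the abstract reports for EEM-BGC. In the made-up boundary cell, no single land patch exceeds the blank area, so the cell is lost under RMA but retained under RMCA, which matches the abstract's remark that combination gives RMCA a higher retention probability than RMA.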
Keywords/Search Tags:rasterization, coding method, error, mechanism, land use data