| With the continuous advancement of location-acquisition and data storage technologies, the capacity to collect and store geographical data is growing steadily, offering an unprecedented opportunity for geospatial exploration and urban analysis. In-depth exploratory analysis of large-scale geographical data can effectively reveal spatial patterns and trends. However, the ever-increasing scale of geographical data makes exploratory data analysis difficult, especially in geospatial analysis, spatio-temporal data mining and visual analysis. The scatterplot, a common technique for visualizing geographical point data, often suffers from visual clutter and overdraw caused by the conflict between limited screen space and the large number of points, leaving users unable to perceive the latent features of interest in the original geographical point datasets. It is therefore urgent to simplify the cluttered visualization of large-scale geographical points so that users can understand and analyze the data more effectively. Sampling is a common way to reduce the size of large-scale geographical datasets and obtain a high-quality simplified representation. Existing sampling methods focus on point densities, local outliers and blue noise features, all of which are closely tied to geographical coordinates. However, attribute data, which plays a significant role in geospatial exploration, is ignored during sampling. As a basic component and important information source of geospatial analysis, attribute data is essential in fields such as infectious disease research and environmental protection. Because traditional methods do not consider the spatial features related to data attributes, such as attribute distributions and spatial autocorrelations, they struggle to achieve a high-quality simplified representation of large-scale geographical point 
data. As an important method of big data analysis, visual analysis integrates data visualization techniques and analysis theory with user-friendly human-computer interaction, helping users perceive large-scale geographical data intuitively and improving the efficiency and depth of exploratory data analysis. The purpose of this dissertation is therefore to study the simplified representation of large-scale geographical point data using spatial analysis methods, sampling techniques and visual analysis methods. The main contributions of this dissertation are summarized as follows. First, we propose a visual abstraction method that aims to preserve both the attribute distribution and the spatial distribution of large-scale geospatial point data. In this work, we begin by combining the histogram method with an outlier detection method to obtain the attribute distribution of the geographical point data. We then propose a sampling technique in which priority and backtracking criteria are introduced so that the sampling results preserve the spatial densities of the points and the original attribute distribution of the data. In addition, several visualization components are designed to let users interactively evaluate the simplified results and analyze them online. Finally, a visual analysis system is implemented to integrate the sampling models, the visual designs and a rich set of interactions. Case studies and quantitative comparisons based on real-world datasets, together with interviews with domain experts, demonstrate the effectiveness of our system in simplifying the visualization of large-scale geographical point data and in exploring regional characteristics across different local areas. Second, unlike conventional statistical analysis, spatial data analysis is complicated by the existence of spatial autocorrelations. Thus, we further propose an attribute-based sampling method for visual abstraction of large-scale 
geographical points, which not only preserves point densities but also maintains their spatial autocorrelations. We first use Moran’s I to capture the spatial autocorrelations of the points and present their attribute relationships with Moran scatterplots. We then propose an attribute-based sampling model to select a subset of representative points, in which point densities are preserved by Z-order sampling, while attribute relationships are retained by selecting points with consistent spatial autocorrelations. In addition, a set of visual cues is designed to highlight ambiguous points whose spatial autocorrelations change drastically, and a series of user-guided replacement interactions enables users to further optimize the sampled points. Furthermore, we implement a visual analysis system, and conduct a set of case studies and quantitative comparisons on real-world datasets to demonstrate the effectiveness of our method in the abstraction and exploration of large-scale geographical point datasets. In summary, focusing on the attribute features of large-scale geographical point data, our study designs several sampling methods for the visual abstraction and analysis of such data, helping users explore the important information hidden in it. |
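
As a rough illustration of two building blocks named in the abstract, the following sketch computes a global Moran's I and performs a simple stride sample along a Z-order (Morton) curve. It is not the dissertation's sampling model: the function names, the dense weight matrix, the 16-bit grid resolution and the fixed-stride selection are all assumptions made for this example, and the selection of points with consistent spatial autocorrelations is not attempted here.

```python
def morton_key(ix, iy, bits=16):
    """Interleave the bits of two grid indices into a Z-order (Morton) key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)       # x bits go to even positions
        key |= ((iy >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return key


def z_order_sample(points, step, bits=16):
    """Return indices of every `step`-th point along the Z-order curve.

    Assumption: a fixed stride over the sorted curve stands in for
    density-preserving sampling; dense regions contribute more points.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    span = max(max(xs) - x0, max(ys) - y0) or 1.0  # avoid division by zero
    scale = (2 ** bits - 1) / span                 # map coordinates to grid cells
    order = sorted(
        range(len(points)),
        key=lambda i: morton_key(int((xs[i] - x0) * scale),
                                 int((ys[i] - y0) * scale), bits),
    )
    return order[::step]


def morans_i(values, weights):
    """Global Moran's I for attribute values and a dense spatial weight matrix.

    weights[i][j] is the spatial weight between points i and j (0 if unrelated).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)
```

One way to use the two pieces together is to compute Moran's I on the full dataset and again on the sampled subset (with weights rebuilt for the subset), and compare the two values as a coarse check on how well the sample retains spatial autocorrelation.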