
Optimization And Acceleration Of Spatiotemporal Ripley's K Function For Enabling Massive Point Pattern Analysis

Posted on: 2020-11-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2370330590976776
Subject: Cartography and Geographic Information Engineering
Abstract/Summary:
With the broad application of sensors and the development of information infrastructure, a growing number of point of interest (POI) datasets are becoming available in both the spatial and temporal dimensions, forming the data basis for research on various natural phenomena and social events in the real world. As a representative method of point pattern analysis, the space-time Ripley's K function is regarded as a powerful approach for investigating the spatiotemporal distribution of point processes at multiple scales. However, it is computationally intensive because of the massive point-wise comparisons and complex edge correction required in estimation and simulation. As data volume grows, the time cost of the space-time Ripley's K function rises sharply, which impedes its application to large-scale event point datasets. Parallel computing technologies based on multi-core CPUs and many-core GPUs have been leveraged to accelerate the purely spatial Ripley's K function, and related experiments have demonstrated substantial acceleration. However, parallel Ripley's K function methods are limited by the storage capacity of a standalone computer and do not fit well into current big data pipelines. Meanwhile, existing distributed spatial data processing systems do not fully support spatiotemporal objects, which the space-time Ripley's K function requires.

To fill this gap, this study presents a distributed computing method for the space-time Ripley's K function that lowers the barrier to spatiotemporal point analysis of massive POI datasets. Several strategies are involved: 1) a spatiotemporal index is used to narrow query scopes and quickly retrieve points that satisfy the spatial and temporal thresholds; 2) spatiotemporal edge correction weights are reused through a 2-tier cache, avoiding repetitive computation in estimation and simulation; 3) spatiotemporal partitioning is adopted to decrease data redundancy and support near-balanced distributed processing of the space-time Ripley's K function; 4) a customized serializer for spatiotemporal objects and indexes is developed to provide compact byte-array representations and to reduce the overhead of data transmission and serialization/deserialization. The former two strategies aim to reduce the time complexity of the space-time Ripley's K function, and the latter two are designed to improve its performance in distributed systems.

Experiments show the efficiency and scalability brought by these optimization and acceleration strategies. In addition, the impact of the input parameters of the space-time Ripley's K function on efficiency and on the results is discussed, and an application case is presented as a reference for spatiotemporal point analysis. Based on the optimization strategies above, a visual analytics framework for the space-time Ripley's K function was designed, and a prototype system was implemented to show the feasibility and potential value of the proposed methodology.
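To make the cost of point-wise comparison and the role of a spatiotemporal index concrete, the following is a minimal single-machine sketch in Python, not the thesis implementation. It assumes an estimator normalized by n(n-1) (one common convention), sets all edge correction weights to 1, and uses a uniform space-time grid as a stand-in for the spatiotemporal index. The function names `st_k_naive` and `st_k_grid` and the parameters `area` and `duration` are hypothetical.

```python
import math
from collections import defaultdict


def st_k_naive(points, s, t, area, duration):
    """Uncorrected space-time K estimate by comparing every ordered pair.

    points: list of (x, y, time) tuples; s, t: spatial and temporal thresholds.
    Assumption: edge correction weights are all 1.
    """
    n = len(points)
    count = 0
    for i, (xi, yi, ti) in enumerate(points):
        for j, (xj, yj, tj) in enumerate(points):
            if i == j:
                continue
            if math.hypot(xi - xj, yi - yj) <= s and abs(ti - tj) <= t:
                count += 1
    return area * duration * count / (n * (n - 1))


def st_k_grid(points, s, t, area, duration):
    """Same estimate, using a uniform space-time grid to prune candidates.

    Cells have side s in space and t in time, so any pair within the
    thresholds lies in the same cell or in one of its 26 neighbours.
    """
    n = len(points)
    cells = defaultdict(list)
    for idx, (x, y, tm) in enumerate(points):
        key = (math.floor(x / s), math.floor(y / s), math.floor(tm / t))
        cells[key].append(idx)

    count = 0
    for (cx, cy, ct), bucket in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dt in (-1, 0, 1):
                    # .get avoids creating empty cells while iterating
                    neighbours = cells.get((cx + dx, cy + dy, ct + dt), ())
                    for i in bucket:
                        xi, yi, ti = points[i]
                        for j in neighbours:
                            if i == j:
                                continue
                            xj, yj, tj = points[j]
                            if (math.hypot(xi - xj, yi - yj) <= s
                                    and abs(ti - tj) <= t):
                                count += 1
    return area * duration * count / (n * (n - 1))
```

Both functions count the same ordered pairs, so they return identical values; the grid version only inspects nearby cells, which is the kind of query narrowing that strategy 1) describes. A production system would additionally need edge correction, weight caching, and partitioning across workers.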
Keywords/Search Tags:point pattern analysis, big data, distributed computing, spatial index, data partitioning