
Research On Spatial Co-location Pattern Mining Technology Extension And Parallel Mining Approach

Posted on: 2022-12-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Z Yang
Full Text: PDF
GTID: 1528306335995069
Subject: Computer Science and Technology
Abstract/Summary:
Spatial data mining aims to automatically extract useful patterns and knowledge from spatial databases. Co-location pattern mining is an important branch of spatial data mining research that seeks to discover associations among spatial features. A co-location pattern is a subset of the spatial feature set whose instances are frequently located in close proximity. Mining co-location patterns reveals relationships among different features, which may yield crucial insights for various location-based applications.

With the proliferation of spatial data collection technologies (e.g., GPS, remote sensing) and spatial database technology, massive spatial data are being generated, which brings new challenges to co-location pattern mining. On the one hand, the performance demands on co-location pattern mining algorithms are higher. However, most existing algorithms have high computational complexity and run serially on a single machine, so they either cannot process massive spatial data or do so inefficiently. On the other hand, the instances of features in massive spatial data exhibit not only large differences in quantity but also complex interactions, neither of which is considered by traditional co-location pattern mining methods. This may lead to the loss of valuable patterns and spatial distribution information.

In response to these problems, we extend spatial co-location pattern mining technology. First, a new co-location pattern mining algorithm framework and a distributed parallel processing technique are proposed to efficiently discover co-location patterns from massive spatial data. Second, the differences in the number of instances among features and the complex interactions between instances are taken into account in co-location pattern mining, so as to discover more meaningful patterns. The main contributions are as follows:

1. Because calculating the participation index in co-location pattern mining is highly complex, a column-based calculation method is proposed to replace the table-instance-based method. The method searches only for the participating instances needed to calculate the participation index, without generating the table instance of each pattern. Several pruning and optimization techniques are proposed to speed up the search for participating instances. On this basis, the CPM-Col algorithm based on column calculation is presented. The complexity of CPM-Col is analyzed theoretically, and its performance is verified by extensive experiments.

2. A parallel co-location pattern mining approach based on neighbor-dependent partitions is proposed. The approach first divides the neighbor relationships to obtain neighbor-dependent partitions, so that a co-location pattern mining subtask can be executed on each partition independently and the whole mining task can be performed in parallel on a distributed computing cluster. On this basis, the CPM-Col algorithm is parallelized on the MapReduce platform. Extensive experiments verify the efficiency and scalability of the parallel CPM-Col algorithm when processing massive spatial data.

3. To account for the rarity of features in a co-location pattern, the participation index measure is extended to a weighted participation index measure. Unlike existing co-location pattern mining methods, the weighted participation index can discover prevalent co-location patterns both with and without rare features. The weighted participation index possesses a conditional anti-monotone property, on which an efficient algorithm is designed. A series of experiments verify the effectiveness of the weighted participation index measure and the efficiency of the proposed algorithm.

4. The traditional co-location pattern mining problem is extended by considering the complex interactions between instances. We not only extend the co-location measure for the instances of the features in a pattern, but also consider the aggregation of instances of the same feature. Spatial co-location pattern mining with coupling relations is presented and an algorithm for it is developed. Experiments verify the effectiveness and efficiency of the proposed techniques. Compared with traditional methods, co-location patterns under the coupling relation constraint can present richer spatial relationships.
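For readers unfamiliar with the participation index mentioned in contribution 1, the following is a minimal sketch of the conventional table-instance-style computation that the column-based method is meant to replace. It is not the CPM-Col algorithm. It assumes the standard definition (the participation ratio of a feature is the fraction of its instances that appear in at least one row instance of the pattern, and the participation index is the minimum ratio over the pattern's features), a Euclidean distance threshold `d` as the neighbor relation, and a hypothetical data layout and function name chosen only for illustration.

```python
from itertools import combinations, product
from math import dist


def participation_index(instances, pattern, d):
    """Brute-force participation index of `pattern`.

    instances: dict mapping feature name -> list of (instance_id, x, y)
               (hypothetical layout, assumed for this sketch)
    pattern:   tuple of feature names, e.g. ("A", "B")
    d:         neighbor distance threshold (Euclidean, an assumption)
    """
    # Instances of each feature that take part in at least one row instance.
    participating = {f: set() for f in pattern}

    # Enumerate every combination with one instance per feature.
    for row in product(*(instances[f] for f in pattern)):
        # A row instance requires all pairs of its members to be neighbors.
        if all(dist(a[1:], b[1:]) <= d for a, b in combinations(row, 2)):
            for f, inst in zip(pattern, row):
                participating[f].add(inst[0])

    # Participation ratio per feature, then the minimum over the features.
    ratios = [len(participating[f]) / len(instances[f]) for f in pattern]
    return min(ratios)


# Toy data: two of the three A instances have a B instance within d = 1.5.
toy = {
    "A": [("A1", 0.0, 0.0), ("A2", 5.0, 5.0), ("A3", 10.0, 0.0)],
    "B": [("B1", 0.0, 1.0), ("B2", 5.0, 6.0)],
}
pi_ab = participation_index(toy, ("A", "B"), 1.5)
```

In this toy case the participation ratios are 2/3 for A and 1 for B, so the participation index is 2/3. The enumeration cost grows with the product of the instance counts of the features, which illustrates why avoiding explicit table instances is attractive for large data sets.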
Keywords/Search Tags:Spatial data mining, Co-location pattern, Rare feature, Coupling analysis, Column-based calculation, Parallel algorithm