Large-scale Trajectory Similarity Search With Distributed Tries

Posted on:2022-09-08

Degree:Master

Type:Thesis

Country:China

Candidate:L G Weng

Full Text:PDF

GTID:2480306572497334

Subject:Computer technology

Abstract/Summary:

PDF Full Text Request

Trajectory data is a sequence of geographic information obtained by sampling the movement process of a object.With the popularity of GPS devices,the scale of trajectory data has grown explosively,which makes it possible to analyze trajectory data to facilitate people’s life.The k nearest neighbor query based on trajectory similarity is one of the basic operations of trajectory analysis.However,the massive trajectory data make the existing stand-alone algorithms unable to efficiently complete the query task.Distributed schemes can use the resources of multiple machines to speed up the query process,but the most advanced distributed schemes have the problems of waste of computing resources and low query efficiency of local index.In order to solve the above problems,we propose a distributed trajectory similarity search framework REPOSE that runs on Spark to process k nearest neighbor query.The framework supports a variety of trajectory similarity measures,including Hausdorff distance,Frechet distance and DTW distance.REPOSE consists of a global partitioning strategy and a local index.The global partitioning strategy is a novel heterogeneous partitioning strategy,which balances the composition of each partition to achieve load balancing and improve the query efficiency of the local index.The local index uses Z-order to discretize the original trajectory into a reference trajectory,and then uses a reference trie to index the reference trajectories.To further improve query efficiency,REPOSE uses a sequence-independent optimization method to improve the performance of the reference trie.In addition,three powerful lower bounds of pruning(one-side lower bound,two-side lower bound and pivot lower bound)are proposed based on the nature of reference trajectories to reduce the number of computations and use intermediate results to reduce the cost of calculating lower bounds.In order to fully demonstrate the effect of REPOSE,its performance was compared with that of the most advanced distributed trajectory similarity search framework on real datasets.The experimental results show that the performance of REPOSE is superior to the existing distributed trajectory similarity search framework.

Keywords/Search Tags:

PDF Full Text Request

Related items

1	Research And Design Of Gene Similarity Search Method Base On Heterogeneous Network
2	Research On Similarity Search Algorithm Of Heterogeneous Information Network Based On Hierarchical Attention
3	Personalized Top-k Trajectories Search Based On Location
4	Similarity Search On Weighted Heterogeneous Information Networks
5	Research On Top-K Relevant Search In Heterogeneous Information Network
6	Research On Vehicle Trajectory Clustering And Prediction Algorithm Based On Spatiotemporal Data
7	Research On Similarity Search In Large-Scale Graph Databases
8	Learning Trajectory Deep Representation For Trajectory Similarity Computation
9	Research On Trajectory Similarity Measure Based On Movement Features
10	Trajectory Clustering And Anomaly Detection Algorithm For GPS Data