
Frequent Pattern Mining Of Taxi GPS Trajectories

Posted on: 2024-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: H W Deng
Full Text: PDF
GTID: 2530307088955539
Subject: Applied statistics
Abstract/Summary:
With the growing number of motor vehicles and the spread of global positioning systems, vehicle movement data has become easy to collect, and its scale keeps increasing. This data contains rich information about the travel patterns of urban residents. As a basic method for analyzing mobility data, frequent pattern mining can be extended to hotspot area extraction and frequent path mining. Frequent pattern mining on taxi GPS trajectory data can reveal traffic conditions on the urban road network and uncover urban travel patterns, providing managers with valuable decision-support information.

Current research on frequent pattern mining of trajectories focuses mainly on improving frequent path mining algorithms and does not incorporate trajectory preprocessing or hotspot area extraction into a single system. As a result, no unified workflow exists for practical applications, algorithms are difficult to reproduce, and the semantic information in the data is not fully explored. For hotspot area extraction, current research relies mainly on the DBSCAN algorithm, which is not suited to taxi GPS location data with an uneven density distribution. For frequent path mining, conventional clustering methods analyze only geographic location and ignore how trajectories are distributed over time, so they cannot accurately capture the travel characteristics of moving objects.

To address these problems, this thesis takes the characteristics of trajectory data into account and constructs a general framework for frequent pattern mining on taxi GPS trajectory data. The framework specifies the trajectory preprocessing workflow for the two tasks of hotspot area extraction and frequent path mining, and proposes an algorithm for the hotspot area extraction task that remains effective when the data is unevenly distributed. In addition, the thesis improves frequent path mining by introducing a time factor into the TRACLUS algorithm, which considers only the spatial factor. The research covers the following four aspects:

(1) The general framework of trajectory frequent pattern mining is improved. The basic concepts of trajectory definition, modeling representation, and main features are given. The key technique of trajectory clustering, the trajectory similarity measure, is introduced in detail in three categories. Because existing frequent pattern mining focuses mainly on frequent path mining, its scope is extended to two tasks: hotspot area extraction and frequent path mining. The workflow from preprocessing to algorithm implementation is developed, and the general framework of frequent pattern mining is established.

(2) A task-specific preprocessing method for trajectory data is created. Based on the characteristics of trajectory data, suitable preprocessing techniques are selected for the two tasks of hotspot area extraction and frequent path mining. Data cleaning is used to obtain pick-up and drop-off points and passenger-carrying trajectories. Latitude and longitude are converted into coordinates in a planar rectangular coordinate system, which simplifies distance calculation and the setting of algorithm parameters. Map matching is used to reduce the influence of positional offsets in the collected data.

(3) A quality threshold clustering algorithm based on neighborhood distance association is proposed to solve the urban hotspot area extraction problem. The algorithm improves on the quality threshold algorithm by considering the maximization of the number of clusters in a limited range and the closeness between the cluster centers and the local density, which effectively handles unevenly distributed data. A comparison with the quality threshold algorithm and the DBSCAN algorithm, using both visualization and clustering quality evaluation, shows that the proposed algorithm clearly identifies aggregation locations. Its silhouette coefficient is above 0.65 in several groups of comparison experiments, significantly higher than that of the quality threshold algorithm and the DBSCAN algorithm. Finally, the method is applied to hotspot area extraction in Chengdu, which verifies the practicality of the algorithm.

(4) The ST-TRACLUS clustering algorithm, based on the time factor, is proposed to solve the taxi frequent path mining problem. The algorithm extends TRACLUS by adding a time factor to its spatial similarity measure: a time distance, defined on the basis of the Jaccard distance, serves as the key variable in the distance calculation. A comparison with TRACLUS, using both visualization and clustering quality evaluation, shows that ST-TRACLUS better identifies where taxis cluster in both space and time, and that its silhouette coefficients are significantly higher than those of TRACLUS across multiple groups of experiments. Finally, the method is applied to frequent path mining in Chengdu, which verifies the practicality of the algorithm.
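
The abstract does not give the formula for the time distance in (4). As a minimal sketch, assuming each trajectory segment carries a start and end time and that the Jaccard concept is applied to those time intervals, one plausible form is one minus the ratio of the overlap to the union of the two intervals. The function name and the use of plain numeric timestamps are illustrative assumptions, not the thesis's definitions, and how this value is combined with the spatial distance of TRACLUS is not specified in the abstract.

```python
def temporal_jaccard_distance(start_a, end_a, start_b, end_b):
    """Jaccard-style distance between two time intervals.

    Hypothetical helper: times are plain numbers (e.g. seconds since
    midnight). Returns 0.0 for identical intervals and 1.0 for intervals
    that do not overlap.
    """
    if end_a < start_a or end_b < start_b:
        raise ValueError("interval end precedes its start")

    # Length of the shared time span, clamped at zero when disjoint.
    overlap = max(min(end_a, end_b) - max(start_a, start_b), 0.0)

    # Length of the union = sum of both lengths minus the shared part.
    union = (end_a - start_a) + (end_b - start_b) - overlap

    if union == 0.0:
        # Both intervals are single instants: equal instants are identical.
        return 0.0 if start_a == start_b else 1.0

    return 1.0 - overlap / union
```

For example, two segments covering 08:00 to 08:10 and 08:05 to 08:15 share 5 of 15 minutes, so under this assumed definition their time distance is 2/3; segments an hour apart get the maximum distance of 1.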
Keywords/Search Tags: trajectory frequent pattern, trajectory preprocessing, hotspot region, trajectory clustering, frequent path