Research On Information Mining Of Taxi GPS Data

Posted on: 2019-08-26 | Degree: Master | Type: Thesis
Country: China | Candidate: L Y Chen | Full Text: PDF
GTID: 2382330545952218 | Subject: Electronic and communication engineering
Abstract/Summary:
With the widespread use of taxis and the rapid development of GPS and other positioning technologies, information centers now collect large volumes of GPS trajectory data. These data are valuable for studying residents' travel behavior and taxi drivers' habits, and for urban traffic construction and planning, taxi scheduling, and related tasks. Using GPS data from Chengdu taxis, this thesis analyzes residents' travel behavior and travel hot spots, and explores taxi drivers' behavior on the Spark big data platform. The main work is as follows:

1. A Spark-based data preprocessing method was designed. The original data sets contain a large amount of noisy data and cannot meet the requirements of the experiments, so they must be preprocessed. On the Spark platform, a block deduplication algorithm was designed to remove redundant and erroneous records. The data sets were then given a secondary sort, by taxi number and then by time. All pick-up and drop-off track points were extracted from the data sets and persisted. This step produced the secondary-sorted data sets and the data sets of passenger pick-up and drop-off points.

2. The temporal and spatial distribution of Chengdu residents' travel was studied. For the temporal distribution, indicators such as total travel volume, hourly travel volume, and taxi load rate were analyzed, and the peak pick-up and drop-off periods over a week were identified. For the spatial distribution, a k-means clustering model was used, with the most suitable parameter k selected using inter-class distance and a penalty term. The pick-up and drop-off track points were clustered with this model, and the popular travel areas in Chengdu were identified and visualized on a real-view map of the city. This step produced the spatial and temporal distribution of Chengdu residents' travel, from which patterns in residents' travel activities can be found.

3. A method for studying taxi driver behavior from taxi GPS data was proposed. Ten features describing taxi drivers' behavior preferences were constructed from the GPS data, and a behavior model was established for drivers with high and low passenger loading rates. Features were selected using univariate feature selection, recursive elimination, the Gini coefficient, and other feature selection methods. LR and RF models were trained to judge the validity of the features. Experiments show a clear distinction between the behavior of drivers with high passenger loading rates and that of drivers with low passenger loading rates. Drivers with a low load rate can be guided by the behavior of drivers with a high load rate, which can increase taxi drivers' profits.

The results of this thesis can be used not only to evaluate the traffic characteristics of Chengdu but also to predict residents' travel demand reasonably. The mined hot spots can provide a scientific reference for urban applications such as Chengdu's infrastructure planning, shopping mall siting, and land value assessment. Moreover, the results on taxi driver behavior offer taxi drivers a way to increase their passenger loading rate, which has high application value.
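The preprocessing in task 1 (deduplication, sorting by taxi number and time, then extracting the points where passengers get on and off) can be illustrated with a minimal sketch. This is not the thesis's Spark implementation: it is plain Python, and the record layout (`GpsPoint` with `taxi_id`, `timestamp`, `lat`, `lon`, `occupied`) and the function name `extract_trip_endpoints` are hypothetical, since the abstract does not give the data schema. It assumes each record carries an occupancy flag and treats a change in that flag between consecutive records of the same taxi as a pick-up or drop-off.

```python
from itertools import groupby
from typing import NamedTuple


class GpsPoint(NamedTuple):
    # Hypothetical record layout; the abstract does not specify the schema.
    taxi_id: str
    timestamp: int  # assumed: seconds since some epoch
    lat: float
    lon: float
    occupied: int   # assumed: 1 = carrying a passenger, 0 = vacant


def extract_trip_endpoints(points):
    """Return (pickups, dropoffs) detected from occupancy transitions."""
    # Drop exact duplicate records, then sort by taxi id and, within each
    # taxi, by time (the "secondary sort" mentioned in the abstract).
    ordered = sorted(set(points), key=lambda p: (p.taxi_id, p.timestamp))

    pickups, dropoffs = [], []
    for _, track in groupby(ordered, key=lambda p: p.taxi_id):
        prev = None
        for cur in track:
            if prev is not None:
                if prev.occupied == 0 and cur.occupied == 1:
                    pickups.append(cur)    # vacant -> occupied: passenger got on
                elif prev.occupied == 1 and cur.occupied == 0:
                    dropoffs.append(cur)   # occupied -> vacant: passenger got off
            prev = cur
    return pickups, dropoffs
```

A taxi whose first record is already occupied yields no pick-up for that trip in this sketch, because the transition itself is not observed. On a cluster, the same logic would typically be applied per taxi after partitioning the records by taxi id.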
Keywords/Search Tags: Taxi GPS data, Big data Spark, Residents' travel, Clustering, Taxi driver's behavior