
Transportation Mode Recognition Model And Its Application Based On Active Learning And Semi-Supervised Learning

Posted on: 2020-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Feng
GTID: 2392330599475068
Subject: Transportation engineering
Abstract/Summary:
Transportation mode information is valuable for traffic planning and traffic control management. With the development of big data technology and machine learning algorithms, and with the growing number of mobile phone users, more and more research focuses on mining mobile signaling data to obtain traffic information. Because it requires no user participation and captures users' travel information completely, mobile signaling data, one kind of mobile data, is increasingly used for data mining in the field of traffic pattern recognition. However, because data sources differ in quality and data mining algorithms are complex, the main difficulty remains how to choose an appropriate data source and how to design an efficient algorithm. Therefore, this paper first establishes a data quality evaluation system that accounts for data heterogeneity. Then, after preprocessing the mobile signaling data, it extracts multidimensional transportation characteristics. It further develops an improved manual identification process for traffic patterns and a traffic pattern recognition algorithm that combines active learning with Tri-training semi-supervised learning. Finally, it uses periodic positioning data, one kind of mobile phone signaling data, for a real-world case test. The research aims to improve the efficiency and accuracy of traffic mode recognition, to advance big data mining technology for traffic, and to provide a scientific decision-making basis for optimizing the city's future transportation structure.

First, a data quality evaluation system for mobile phone signaling data is established, and the data is preprocessed. According to the characteristics of heterogeneous mobile phone data, the evaluation system covers three aspects: quality characteristics, sampling characteristics, and positioning characteristics. Based on the quality evaluation of the HY mobile signaling data, data noise such as "Ping-Pong switching" and "drift" is cleaned. The paper then identifies high-frequency points and long-stay points in combination to describe each user's travel origin and destination (OD). To facilitate data analysis, the cleaned and described mobile phone signaling data is organized into travel chains, each of which contains only one traffic behavior.

Then, the paper extracts the transportation characteristics of the travel chains. Drawing on existing research on identifying traffic patterns from mobile signaling data, the transportation characteristics are divided into four categories: distance, time, speed, and traveler attributes. The commonly used travel distance, average speed, and travel time can serve as features for semi-supervised training. On this basis, and considering the characteristics of mobile phone signaling data, the distance, time, and speed categories are further subdivided to form a multidimensional feature model, and the calculation methods are studied. Finally, the feature extraction is applied to 76,000 travel chains in HY City.

Next, an improved manual identification process for traffic patterns is studied. Building on the extracted travel characteristics, the existing Bayesian decision tree method and the third-party navigation data method are improved, a manual recognition process for traffic modes that combines the two methods is developed, and a case analysis is carried out. The results show that the improved manual recognition process can raise manual labeling efficiency by about 35%.

Finally, a traffic pattern recognition algorithm that combines active learning with a Tri-training semi-supervised support vector machine is studied. For the large volume of unlabeled data, active learning combined with the improved manual identification process is used to construct an information-rich labeled sample set, and the Tri-training semi-supervised support vector machine is trained on the labeled samples together with a large number of unlabeled samples. Different sample sets and classification methods are designed to compare the accuracy of active learning and the Tri-training semi-supervised support vector machine. The results show that the information-rich labeled sample set constructed by active learning can reduce the number of iterations of semi-supervised learning, and that the Tri-training semi-supervised support vector machine can improve classifier accuracy by exploiting a large number of unlabeled samples. Combining active learning with the Tri-training semi-supervised support vector machine, the transportation mode of mobile signaling data can be discriminated effectively.
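The two selection rules behind the combined algorithm can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the least-confidence query strategy, and the toy probabilities and labels are all assumptions made for illustration, and the sketch omits the error-rate checks of the full Tri-training procedure as well as the support vector machines themselves. It only shows how active learning might pick samples for manual labeling and how Tri-training might pseudo-label a sample for one classifier when its two peers agree.

```python
from typing import Hashable, List, Sequence, Tuple

Label = Hashable


def least_confident(probabilities: Sequence[Sequence[float]], k: int) -> List[int]:
    """Active-learning query (assumed strategy): return the indices of the
    k samples whose highest class probability is lowest, i.e. the samples
    the current classifier is least sure about."""
    ranked = sorted(range(len(probabilities)), key=lambda i: max(probabilities[i]))
    return ranked[:k]


def agreed_pseudo_labels(
    peer_a: Sequence[Label], peer_b: Sequence[Label]
) -> List[Tuple[int, Label]]:
    """Tri-training agreement rule: for the third classifier, keep every
    unlabeled sample on which its two peers predict the same label, and
    use that shared prediction as the pseudo-label."""
    return [(i, a) for i, (a, b) in enumerate(zip(peer_a, peer_b)) if a == b]


# Hypothetical class probabilities for four unlabeled travel chains.
probs = [[0.90, 0.10], [0.55, 0.45], [0.60, 0.40], [0.99, 0.01]]
to_label = least_confident(probs, k=2)  # candidates for manual labeling

# Hypothetical predictions from two peer classifiers on the same chains.
peer_a = ["bus", "car", "walk", "metro"]
peer_b = ["bus", "walk", "walk", "car"]
pseudo = agreed_pseudo_labels(peer_a, peer_b)  # extra training data for the third

print(to_label, pseudo)
```

In this sketch, the manually labeled samples from the first rule would seed the three classifiers, and the second rule would then be applied in rounds, once per classifier, until no classifier's training set changes.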
Keywords/Search Tags:Mobile signaling data, Transportation mode recognition, Character analysis, Active learning, Tri-training semi-supervised, Support vector machine