The Research On Filling The Missing Floating Car Data Based On The Multivariate Linear Regression Model

Posted on: 2016-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Liu
Full Text: PDF
GTID: 2322330503986971
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Missing data is a widespread problem in practice. In transportation, socio-economic studies, sample surveys, biomedical research, and many other fields, missing data is unavoidable. It not only complicates analysis but can also seriously bias statistical results, which substantially reduces the efficiency of statistical work. Filling in missing values with statistical methods is therefore an important part of data processing and an important way to improve data quality. This thesis studies methods for filling missing data, using floating car data as an example.

We identify where data are missing in the road network by matching floating car data to the road network of Shenzhen. To fill the gaps, we propose a multiple linear regression model so that the data cover as much of the road network as possible, providing a basis for publishing road condition information that makes travel more convenient. The main contents are as follows.

First, considering the spatial and temporal correlation of traffic data, we analyze the spatial correlation of the road network at multiple scales and obtain the spatial factors relevant to interpolating missing data. We also analyze the temporal correlation of the floating car data and determine the size of the time window. Together these lay the foundation for the interpolation model developed later.

Second, we propose a multiple linear regression model built on spatial correlation. A model that uses spatial correlation alone, validated on selected training data, performs poorly and has low precision. To improve precision, we introduce temporal factors into the model. Comparison and verification show that the multiple linear regression model combining spatial and temporal relations is more applicable and more reliable for filling missing data, and we identify four cases in which the model applies. In addition, for the three hotspot areas identified by other members of our research group, we apply traversal filling to each area separately.

The last part is an empirical analysis. Taking Futian District, one of the hotspot areas, as an example, we select training data to verify the model empirically. After checking the model's accuracy against the empirical data, we fill the actual missing data on the roads and provide historical data for the missing part as a basis for releasing traffic information. The result is a reliable model that fills missing data by combining temporal and spatial correlation.
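The general idea of filling a missing speed by regressing on spatial and temporal factors can be illustrated with a minimal sketch. It is not the thesis's actual model. It assumes, purely for illustration, that the spatial factors are the speeds on two neighbouring road segments in the same time window and that the temporal factor is the target segment's own speed in the previous window. The function names, the record layout, and the data are all hypothetical, and the data are synthetic, generated from a known linear rule.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(features, targets):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, feature):
    """Evaluate intercept + sum(coefficient * predictor)."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature))


def fill_missing(records):
    """Fit on records with a known speed, then predict the missing ones.

    Each record is a dict (hypothetical layout):
      'neighbors': speeds on neighbouring segments, same time window (spatial)
      'previous':  speed on the target segment, previous window (temporal)
      'speed':     observed speed on the target segment, or None if missing
    """
    complete = [r for r in records if r["speed"] is not None]
    x = [r["neighbors"] + [r["previous"]] for r in complete]
    y = [r["speed"] for r in complete]
    coefs = fit_ols(x, y)
    return [
        r["speed"] if r["speed"] is not None
        else predict(coefs, r["neighbors"] + [r["previous"]])
        for r in records
    ]


# Synthetic data from: speed = 2 + 0.5*n1 + 0.3*n2 + 0.2*previous
records = [
    {"neighbors": [40.0, 30.0], "previous": 35.0, "speed": 38.0},
    {"neighbors": [50.0, 20.0], "previous": 45.0, "speed": 42.0},
    {"neighbors": [30.0, 40.0], "previous": 25.0, "speed": 34.0},
    {"neighbors": [60.0, 50.0], "previous": 55.0, "speed": 58.0},
    {"neighbors": [20.0, 25.0], "previous": 30.0, "speed": 25.5},
    {"neighbors": [45.0, 35.0], "previous": 20.0, "speed": 39.0},
    {"neighbors": [55.0, 45.0], "previous": 40.0, "speed": None},  # missing
]
filled = fill_missing(records)
print(round(filled[-1], 3))
```

Because the synthetic speeds follow the linear rule exactly, the filled value for the last record should be approximately 51. Real floating car data would not fit this well, and the choice of neighbouring segments and time window would have to come from a correlation analysis such as the one the abstract describes.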
Keywords/Search Tags: floating car data, multivariate linear regression model, spatio-temporal correlation