Automatic Incident Detection Method Based On Under-Sampling For Imbalanced Datasets

Posted on: 2017-07-09
Degree: Master
Type: Thesis
Country: China
Candidate: M H Li
GTID: 2322330491962099
Subject: Transportation engineering
Abstract/Summary:
Rapid economic growth and rising car ownership have sharply increased the demand placed on the transportation system. As a result, traffic operating conditions have deteriorated and traffic incidents occur with increasing frequency. Timely and accurate incident detection is therefore important for reducing the delay, congestion, and accidents caused by traffic incidents, and thus for improving road traffic safety and level of service.

Traffic incidents occur by chance, and incident data are far scarcer than data describing normal traffic states, so traffic incident detection is essentially an imbalanced classification problem. This thesis therefore applies under-sampling methods for imbalanced classification to traffic incident detection, adopts the support vector machine (SVM) as the classifier, and proposes three automatic incident detection (AID) models based on different under-sampling methods.

First, a non-heuristic under-sampling method based on the neighborhood cleaning rule is applied to traffic incident detection: an SVM AID model based on neighborhood clean under-sampling is proposed to improve detection performance. Grid search and particle swarm optimization are used to optimize the SVM parameters.

Second, to avoid the arbitrariness of manually setting the sampling rate in non-heuristic under-sampling, an SVM AID model based on Genetic Algorithm-based Instance Selection (GA-IS) is proposed. It uses the "survival of the fittest" optimization rule of the genetic algorithm (GA) to determine the optimal training set, and its performance is compared with that of the non-heuristic sampling method.

Third, because training on large-scale datasets is time-consuming, an SVM AID model based on Genetic Algorithm-based Support Vector Selection (GA-SS) is proposed, which learns only from the smaller set of support vectors. For large-scale imbalanced datasets, this approach provides an efficient and effective way to learn from imbalanced traffic data with an SVM.

The Singapore AYE simulation database was used as the experimental data, and the algorithms were implemented in MATLAB R2011b. The experimental results show that the proposed AID models improve both the effectiveness and the efficiency of traffic incident detection and achieve better overall detection performance.
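The neighborhood-cleaning idea behind the first model can be illustrated with a small sketch. This is not the thesis's MATLAB implementation: it is a simplified, standard-library Python version of the general neighborhood cleaning rule. The function names (`k_nearest`, `neighborhood_clean`), the toy two-feature data, the label coding (0 = normal traffic, 1 = incident), and `k = 3` are all assumptions made for illustration. The cleaned sample set would then be passed to an SVM for training, which is omitted here.

```python
import math
from collections import Counter


def k_nearest(idx, X, k):
    """Indices of the k samples closest to X[idx] (Euclidean), excluding idx."""
    dists = [(math.dist(X[idx], X[j]), j) for j in range(len(X)) if j != idx]
    dists.sort()
    return [j for _, j in dists[:k]]


def neighborhood_clean(X, y, minority=1, k=3):
    """Simplified neighborhood cleaning; returns the indices of samples to keep.

    Only majority-class samples are ever removed:
    - a majority sample is dropped if its k nearest neighbors out-vote its label;
    - if a minority sample is out-voted, its majority-class neighbors are dropped.
    """
    to_remove = set()
    for i in range(len(X)):
        nbrs = k_nearest(i, X, k)
        vote = Counter(y[j] for j in nbrs).most_common(1)[0][0]
        if vote == y[i]:
            continue  # neighbors agree with the label; nothing to clean
        if y[i] != minority:
            to_remove.add(i)  # majority sample sitting in minority territory
        else:
            to_remove.update(j for j in nbrs if y[j] != minority)
    return [i for i in range(len(X)) if i not in to_remove]


# Hypothetical toy data: two made-up features per sample.
X = [
    (0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (0.8, 0.9), (1.0, 0.3), (0.3, 1.0),  # normal
    (5.0, 5.0), (5.4, 5.2), (5.1, 5.6),                                      # incident
    (5.2, 5.3),                                # labelled normal, inside incident cluster
]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

kept = neighborhood_clean(X, y, minority=1, k=3)
print(kept)
```

On this toy data the only sample removed is index 9, the normal-labelled point lying inside the incident cluster; all three incident samples are kept. Unlike random under-sampling, the number of removed samples is determined by the data rather than by a preset sampling rate.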
Keywords/Search Tags: automatic incident detection, neighborhood clean under-sampling, genetic algorithm, metaheuristic under-sampling, support vector machine