The Improved Algorithm of the Maximal Information Coefficient and Its Applications in Railway Accident Analysis

Posted on: 2017-04-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F B Shao
Full Text: PDF
GTID: 1312330512471714
Subject: Systems Science
Abstract/Summary:
Railway transportation is an important mode within the overall transportation system. In China, the railway network is developing rapidly and railway transportation is entering a new stage; passenger numbers and carloads are rising as railway mileage grows. At the same time, major and extraordinarily serious railway accidents still occur from time to time and cause heavy losses to people and society, so ensuring the safety of railway operations is an important task. As more advanced electronic and electrical equipment is applied to the railway system, the number of factors influencing railway safety grows, and the correlations among these factors should be analyzed first. Compared with other correlation measures, the maximal information coefficient (MIC) can capture a wide range of relationships because of its two key properties, generality and equitability. This dissertation analyzes the definition of the MIC between two variables and the corresponding approximate algorithm, and identifies some of their disadvantages. It then proposes fast algorithms for calculating the MIC of two variables and of multiple variables and, based on the MIC, studies railway accident analysis and prevention. The main innovations are as follows.

1. A mathematical programming model and a fast algorithm for calculating the MIC between two variables in large-scale data sets are proposed. Starting from the definition of the MIC of two variables, the objective and constraints of the calculation are identified and the mathematical programming model is formulated. To address the long computation time of the algorithm proposed by Reshef et al., a fast algorithm is proposed that uses k-means clustering to divide each of the two variables into different numbers of bins. The MIC calculated by the proposed fast algorithm retains the advantages of the MIC, generality and equitability. The computation time is almost the same for variable pairs of different types, and it grows only moderately as the data scale increases. The time complexity of both algorithms is analyzed: the proposed fast algorithm is O(n^1.6), whereas the approximate algorithm proposed by Reshef et al. is O(n^2.4). The proposed fast algorithm is therefore better suited to calculating the MIC of two variables in large-scale data sets.

2. A definition of the MIC of multiple variables and a fast algorithm for calculating it in large-scale data sets are proposed. Using the mutual information equation, the mutual information is decomposed into a sum of mutual information terms between the dependent variable and the independent variables, from which the definition of the MIC of multiple variables is given. The grid over the dependent variable and the independent variables is obtained with k-means clustering, which yields the fast algorithm. Numerical experiments show that the MIC calculated by the proposed fast algorithm retains the advantages of the MIC, generality and equitability. The computation time is short and grows slowly with the data scale, so the proposed fast algorithm is suitable for detecting multi-variable dependence relationships in large-scale data sets.

3. A complex network model for analyzing railway accidents is proposed based on the MIC. In this model, nodes are influencing factors and edges are generated according to the MIC between the two nodes they link. The changes in network structure as the dependence level varies are analyzed, specifically the changes in node degree, degree distribution, the number of isolated points and subgraphs, and the average connectivity degree. As the dependence level increases, the important influencing factors of the analyzed factor are identified from among the many candidate factors.

4. A warning method for railway accident prevention is proposed based on the MIC. The related influencing factors are first sorted by the strength of their correlation as measured by the MIC. Using an artificial neural network, fitting curves are obtained for different numbers of influencing factors, and the best neural network function between the object factor and the influencing factors is selected. Based on this function, a dangerous zone is defined, and a method for railway accident prevention is built on it: when influencing factors enter the dangerous zone, these abnormal factors should be adjusted, which can greatly help prevent railway accidents.
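
The idea behind the two-variable fast algorithm in item 1 (bin each variable with k-means, then take the best normalized mutual information over the admissible grids) can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the function names, the evenly spread starting centers, and the exhaustive loop over grid sizes are assumptions made for illustration. The grid budget n^0.6 and the normalization by log2(min(kx, ky)) follow the MIC definition of Reshef et al.

```python
import math
from collections import Counter


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on 1-D data; returns one bin label per value."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [0] * len(values)
    # Assumption: deterministic start with centers spread over the data range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: abs(v - centers[j]))
                      for v in values]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = sum(members) / len(members)
    return labels


def mutual_information(a, b):
    """Mutual information (in bits) between two label sequences."""
    n = len(a)
    joint, pa, pb = Counter(zip(a, b)), Counter(a), Counter(b)
    return sum((c / n) * math.log2(c * n / (pa[i] * pb[j]))
               for (i, j), c in joint.items())


def mic_kmeans(x, y, alpha=0.6):
    """MIC-style score: best normalized MI over k-means grids kx x ky
    whose cell count kx * ky is at most n ** alpha."""
    n = len(x)
    budget = int(n ** alpha)
    max_k = max(2, budget // 2)
    # Cluster each variable once per bin count and reuse the labels.
    x_bins = {k: kmeans_1d(x, k) for k in range(2, max_k + 1)}
    y_bins = {k: kmeans_1d(y, k) for k in range(2, max_k + 1)}
    best = 0.0
    for kx in range(2, max_k + 1):
        for ky in range(2, max_k + 1):
            if kx * ky > budget:
                break
            mi = mutual_information(x_bins[kx], y_bins[ky])
            best = max(best, mi / math.log2(min(kx, ky)))
    return best


if __name__ == "__main__":
    xs = [i / 199 for i in range(200)]
    print(mic_kmeans(xs, [v * v for v in xs]))  # noiseless quadratic relation
```

Because each variable is clustered once per bin count and the labels are reused across grids, the sketch avoids searching over partitions for every grid. It makes no claim to reproduce the O(n^1.6) complexity reported above.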
Keywords/Search Tags:Railway accident, Pre-warning, Correlation, MIC, Graph model, Cluster algorithm