Research And Application Of Decision Tree Classification Algorithms Based On Track Geometry Car Inspection Data

Posted on: 2010-05-31  Degree: Master  Type: Thesis
Country: China  Candidate: J L Feng  Full Text: PDF
GTID: 2132360275473165  Subject: System theory
Abstract/Summary:
Data mining is a set of methods and technologies for discovering underlying rules and extracting useful knowledge from data. In recent years it has attracted wide attention and become one of the most active areas of research in information systems and computer science.

Data mining has been application-oriented from the start. It is now a popular term in many fields, particularly banking, telecommunications, insurance, transportation, and retail. However, it has rarely been applied to the analysis of track geometry car inspection data. Railway track inspection produces large volumes of such data, which can be mined for latent rules that help analyze existing measurements and forecast future ones. This thesis therefore uses real track geometry car inspection data as an example to explain the significance, current status, and shortcomings of its analysis, and proposes an improved approach that uses decision tree classification algorithms to analyze and forecast the massive data collected by track geometry cars.

The decision tree is one of the best-known classification algorithms: a tree structure used for classification. Each internal node represents a test on an attribute, each edge represents an outcome of that test, and each leaf represents a class or a class distribution. The topmost node is the root node. Because of its high efficiency, speed, interpretability, and simplicity, the decision tree is among the most widely used methods in massive-data environments.

This thesis gives a comprehensive overview of the research status and hot topics of decision trees, and analyzes the ID3 and C4.5 classification algorithms in detail. On this basis, and following an analysis of the time and space complexity of C4.5, it proposes an improved classification algorithm named QC4.5, which introduces two strategies to improve how C4.5 handles continuous attributes.

Using data sets from the UCI Knowledge Discovery in Databases Archive and the UCI Machine Learning Archive as experimental data, this thesis compares the execution efficiency of C4.5 and the new QC4.5 algorithm; the results show that QC4.5 is more efficient than C4.5. In addition, building on the in-depth study of decision tree classification algorithms, a system for classifying track geometry car inspection data is developed. As a general-purpose data mining platform, it could also be applied in other fields.
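To make the continuous-attribute step concrete, the following is a minimal Python sketch of how a C4.5-style learner can choose a binary threshold on one continuous attribute by information gain, also reporting the gain ratio. It is an illustration only: the abstract does not describe the two QC4.5 strategies, so they are not reproduced here, and the function names `entropy` and `best_threshold` are hypothetical. The sketch recomputes entropies at every candidate split, so it is deliberately simple rather than efficient, and it omits details of the real C4.5 program.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_threshold(values, labels):
    """Pick a binary split `value <= threshold` on one continuous attribute.

    Candidate thresholds are midpoints between consecutive distinct values
    (an assumption of this sketch). Returns (threshold, information_gain,
    gain_ratio), or (None, 0.0, 0.0) when all values are identical or no
    split has positive gain.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    total = len(pairs)
    base = entropy([label for _, label in pairs])
    best = (None, 0.0, 0.0)
    for i in range(1, total):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no boundary between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        p_left = i / total
        gain = base - p_left * entropy(left) - (1 - p_left) * entropy(right)
        # Split information: entropy of the left/right partition sizes.
        split_info = -(p_left * math.log2(p_left)
                       + (1 - p_left) * math.log2(1 - p_left))
        if gain > best[1]:
            best = ((lo + hi) / 2, gain, gain / split_info)
    return best
```

For example, with the made-up values `[1.0, 2.0, 3.0, 4.0]` labeled `ok, ok, bad, bad`, `best_threshold` returns a threshold of 2.5 with an information gain of 1.0 bit, since that split separates the two classes perfectly.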
Keywords/Search Tags: Track geometry car inspection data, Data Mining, Decision Tree, ID3, C4.5, QC4.5