
Research On Monotonic Classification Algorithm Based On Decision Trees

Posted on: 2015-01-11    Degree: Master    Type: Thesis
Country: China    Candidate: X Xu    Full Text: PDF
GTID: 2309330461984953    Subject: Systems Engineering
Abstract/Summary:
The decision tree algorithm is one of the most widely used inductive learning algorithms; it approximates a discrete-valued objective function. A decision tree is built in a top-down, recursive manner to reveal the relationships hidden in the data, and decision rules are then generated from the tree. Monotonic classification is an important classification task in which monotonicity constraints between the condition attributes and the decision must be respected: if one object's values on all of the condition attributes are no worse than another object's, then its decision must be no worse than the other object's decision either. Hu et al. proposed an ordered decision tree algorithm based on rank entropy (REMT) for monotonic classification problems. The algorithm generates monotonically consistent decision trees on monotonic training samples and achieves good performance on data containing noise. To obtain higher classification accuracy and efficiency, this thesis studies monotonic classification algorithms based on the REMT algorithm. The main work is as follows:

(1) Ascending and descending rank mutual information are introduced and their behavior under different noise levels is discussed. A decision tree algorithm is then proposed that uses the ascending and descending rank mutual information to build an ascending rank decision tree and a descending rank decision tree, respectively, and integrates the two trees into a single classifier weighted by rule accuracy. Experiments on artificial and real data sets show that the algorithm has three advantages: it guarantees the monotonic consistency of the decision rules on monotonic data sets; it improves classification accuracy; and, by appropriately relaxing the termination conditions, it reduces the size of the decision tree and the length of the decision rules.

(2) Borrowing the idea of the decision forest, a monotonic classification algorithm based on a decision forest is proposed. The algorithm uses resampling to obtain several training subsets, each of which is used to build one decision tree of the forest from a different point of view. This yields several decision trees that resemble one another and that together cover the objects in the original training set to a large extent. Experiments on artificial and real data sets show that the algorithm reduces the length of the classification rules and, because the training subsets are smaller, avoids over-fitting, so it is suitable for larger data sets.

The algorithms that build several decision trees and then integrate them obtain monotonically consistent trees on monotonic training samples. Compared with a single ordered decision tree, they not only improve classification accuracy and reduce the mean absolute error, but also shorten the classification rules, improve classification efficiency, and avoid over-fitting.
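The monotonicity constraint and the rank mutual information criterion described above can be made concrete with a short sketch. The code below is purely illustrative: the helper names are hypothetical, and the formula follows the commonly cited REMT-style definition of ascending/descending rank mutual information, not code taken from the thesis.

import numpy as np

def violates_monotonicity(X, y, i, j):
    """True if sample i is no worse than sample j on every condition
    attribute yet receives a strictly worse decision (one pairwise check
    of the constraint stated above)."""
    return bool(np.all(X[i] >= X[j]) and y[i] < y[j])

def rank_mutual_info(x, y, ascending=True):
    """Rank mutual information between one condition attribute x and the
    decision y (illustrative sketch of the usual REMT-style definition):
        RMI(A, D) = -(1/n) * sum_i log( |[x_i]_A| * |[x_i]_D|
                                        / (n * |[x_i]_A and [x_i]_D|) )
    where [x_i]_A is the set of samples dominating x_i on A when
    ascending=True, and the set dominated by x_i otherwise."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    total = 0.0
    for i in range(n):
        if ascending:
            dom_a = x >= x[i]   # samples no worse than x_i on the attribute
            dom_d = y >= y[i]   # samples no worse than x_i on the decision
        else:
            dom_a = x <= x[i]
            dom_d = y <= y[i]
        both = np.logical_and(dom_a, dom_d).sum()
        total += np.log2(dom_a.sum() * dom_d.sum() / (n * both))
    return -total / n

# Toy check: a monotone attribute carries more rank information about the
# ordinal label than a shuffled, noisier one.
y = [1, 1, 2, 2, 3]
print(rank_mutual_info([1, 2, 3, 4, 5], y))   # about 0.76
print(rank_mutual_info([1, 5, 3, 2, 4], y))   # about 0.36

An ordered decision tree of the kind the thesis builds on selects, at each node, the attribute and cut point that maximize this measure; the ascending and descending variants flip the direction of dominance, which is presumably how the two trees in contribution (1) differ.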
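For contribution (2), the resampling idea can be sketched in outline. The base learner below is a plain scikit-learn decision tree used only as a stand-in (the thesis builds REMT-style ordered trees), and the median-label vote is an assumed integration rule, not necessarily the one used in the thesis.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

class ResampledForest:
    """Bagging-style ensemble for ordinal labels: each tree is trained on a
    resampled subset of the data, and predictions are combined with a median
    vote, which respects the ordering of the labels."""

    def __init__(self, n_trees=10, subset_frac=0.6, random_state=0):
        self.n_trees = n_trees
        self.subset_frac = subset_frac
        self.rng = np.random.default_rng(random_state)
        self.trees = []

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        m = max(1, int(self.subset_frac * len(X)))
        self.trees = []
        for _ in range(self.n_trees):
            idx = self.rng.choice(len(X), size=m, replace=False)  # smaller training subset per tree
            # Stand-in base learner; the thesis grows ordered trees with
            # rank mutual information instead.
            self.trees.append(DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx]))
        return self

    def predict(self, X):
        votes = np.stack([t.predict(X) for t in self.trees])   # shape (n_trees, n_samples)
        return np.median(votes, axis=0).round().astype(int)    # ordinal median vote

# Usage on a toy monotone data set with three ordinal classes.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(200, 3))
y = np.digitize(X.sum(axis=1), bins=[1.0, 2.0])   # labels 0, 1, 2, monotone in X
model = ResampledForest(n_trees=15).fit(X, y)
print((model.predict(X) == y).mean())

Because every tree sees only a fraction of the data, the individual trees stay small and the ensemble is less prone to over-fitting than a single deep tree, which mirrors the behavior the abstract reports for the decision-forest algorithm.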
Keywords/Search Tags: Decision Tree, Monotonic Classification, Rank Mutual Information, Decision Forest