| Advances in computer technology, and in artificial intelligence in particular, have driven unprecedented progress in data mining as a field of information processing. The main task of data mining is to extract useful knowledge and information from large volumes of data through processing and analysis, in order to support decision-making in application fields such as electronic shopping, business analysis, disaster relief, and health care. Classification is one of the most important data mining tasks, and many classification algorithms exist, including neural networks, association rules, and decision trees. The decision tree algorithm is the most commonly used classification algorithm, offering fast classification, high accuracy, and good scalability. In practical applications, however, decision tree classification also has problems, such as large resulting trees and low efficiency, so optimizing and improving the decision tree algorithm is of considerable significance. This paper provides an overview of data mining, with emphasis on data classification, and of the basics of rough set theory. It also presents a comparative analysis of attribute reduction methods based on rough set theory and of a variety of decision tree classification algorithms. To address the shortcomings of attribute reduction methods and decision tree classification algorithms, the thesis explores how to optimize the decision tree algorithm.
The main work of the paper is as follows: (1) This paper tentatively improves the commonly used reduction algorithm based on attribute importance by using attribute dependency from rough set theory. The improved algorithm has lower time complexity at the same reduction ability, and experimental results show that it is effective. (2) This paper uses Weka to obtain the optimal values of G and M, the parameters of the pre-pruning algorithm, and proposes an improved error-based pruning (EBP) method using the Laplace correction. The results show that the new pruning method considerably reduces tree size and generally increases accuracy. (3) The improved algorithm is applied in a real hospital management analysis system, where it is used to generate a decision tree. The nine rules extracted from the tree provide valuable reference information for outpatient doctors and medical experts. |
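Contribution (1) rests on the rough-set notion of attribute dependency. As a minimal sketch of the textbook definition only, and not the improved reduction algorithm of this thesis, the dependency degree of a decision attribute D on a condition attribute set C can be computed as |POS_C(D)| / |U|, where the positive region POS_C(D) collects the objects whose C-equivalence class carries a single decision value. The toy table `ROWS`, its attribute names, and the helper names `partition`, `dependency`, and `significance` are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy decision table (not data from the thesis).
ROWS = [
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "no",  "flu": "yes"},
    {"fever": "no",  "cough": "yes", "flu": "no"},
    {"fever": "no",  "cough": "no",  "flu": "no"},
    {"fever": "yes", "cough": "yes", "flu": "no"},
]


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, cond_attrs, dec_attr):
    """Dependency degree gamma = |POS_C(D)| / |U| (standard rough-set definition)."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, cond_attrs):
        # A block belongs to the positive region if all its rows agree on the decision.
        if len({rows[i][dec_attr] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def significance(rows, cond_attrs, dec_attr, attr):
    """Drop in dependency degree when attr is removed from cond_attrs."""
    reduced = [a for a in cond_attrs if a != attr]
    return dependency(rows, cond_attrs, dec_attr) - dependency(rows, reduced, dec_attr)
```

Calling `significance(ROWS, ["fever", "cough"], "flu", "fever")` measures how much the dependency degree falls when `fever` is removed. Importance-based reduction heuristics typically add or drop attributes according to a score of this kind, although the exact scoring and search order used in the thesis are not given in this abstract.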