| Data mining and knowledge discovery in databases (KDD) is a rapidly emerging research field that draws on artificial intelligence, database systems, and statistics. It is the nontrivial process of extracting interesting, potentially useful, valid, and understandable knowledge from data. Classification is an important branch of data mining; it aims to build a classifier that predicts the class label of newly arriving data. The decision tree is the typical model for classification, and its construction algorithms, ID3 and C4.5, proposed by J. R. Quinlan in 1986 and 1993 respectively, are well known. A concept lattice represents knowledge through the relation between the intensions and extensions of concepts and through the generalization and specialization relation among concepts, which makes it an effective tool for KDD. By introducing equivalent intensions into GCL, the extended form of the concept lattice (ECL) is obtained, which represents knowledge more clearly and distinctly. This dissertation discusses classification based on ECL and proposes a pruning approach for the case where the ECL is restricted to classification. Classification rules induced from the pruned classification ECL are superior in quality to those induced from a decision tree. Pruning reduces the size of the lattice and speeds up rule induction. More importantly, the rules induced from ECL are reduced, whereas those induced from a decision tree are redundant. It is shown both theoretically and experimentally that ECL is superior to the decision tree for classification. However, because ECL is complete, it grows too large, which makes it unsuitable for large databases. The huge size and non-relational structure of databases, together with distributed databases, pose new challenges for data mining and KDD, and for ECL the situation is worse. Many researchers have proposed parallel and distributed computing environments as an efficient resource for data mining. 
This dissertation proposes classification based on a distributed ECL, which is expected to improve running time and complexity. This promising approach can be used for centralized as well as distributed databases. |