The Bayesian network (BN) is the product of combining probability theory with graph theory, and is an efficient tool for solving problems of uncertainty in artificial intelligence. Learning the structure of a BN entails determining the dependency relationships between its nodes from the dataset. This is a highly important challenge that has been extensively studied over the past few decades; however, learning the optimal BN structure from data has been proven to be NP-hard. The independence assumption is one of the most effective approaches to this issue: it can greatly simplify the topology of a BN and improve the estimation of conditional probabilities. Bayesian network classifiers (BNCs) are special types of BNs designed for classification problems. Naive Bayes (NB) is the simplest BNC; it assumes that the attributes are conditionally independent given the class. In reality, however, the attributes in many learning tasks are correlated with each other, so NB's conditional independence assumption may impair its classification performance. Structure extension is the most direct and effective way to relax the independence assumption of NB, since attribute dependencies can be explicitly represented by directed edges in the structure. Among the BNCs that relax NB's independence assumption through structure extension, the k-dependence Bayesian classifier (KDB) has received extensive attention from researchers, since it can represent k-dependence relationships between attributes. KDB applies information-theoretic metrics, e.g., conditional mutual information, to build its network topology, and controls its bias/variance trade-off with a single parameter k. However, in the process of building the network topology, some non-significant dependency relationships are neglected and an implicit independence assumption is introduced. An unverified independence assumption may result in a suboptimal network topology and biased estimates of conditional probabilities. This paper therefore focuses on learning the dependencies between attributes while verifying the rationality of the independence assumptions in the network topology.

To address this issue, the k-independence Bayesian classifier (KIBC) proposed in this paper extends the definition of conditional mutual information to learn dependence relationships conditioned on the class variable or a predictive attribute. In addition, the network topology learned from the training data cannot accurately represent the dependencies implicit in different testing instances. Therefore, to make full use of the information in each testing instance, KIBC applies local mutual information and local conditional mutual information to learn a corresponding local model for each testing instance. The dependency relationships represented in the two network topologies are thus better suited to fitting labeled and unlabeled data, respectively. Finally, KIBC applies an ensemble learning strategy that combines the models learned from the training data and from the testing instance to perform classification.

To evaluate the effectiveness of the proposed KIBC, we compare it with other competitors on 40 UCI datasets in terms of zero-one loss, root mean square error (RMSE), bias, and variance. The Friedman and Nemenyi tests are also used to analyze the experimental results. The results show that KIBC achieves a significant advantage over a range of state-of-the-art single-model BNCs (e.g., SKDB and CFWNB) and ensemble BNCs (e.g., WATAN, IWAODE, WAODE-MI, TAODE, and DWAODE).
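As a concrete illustration of the information-theoretic metrics mentioned above, the sketch below gives the standard plug-in estimator of conditional mutual information (the quantity KDB uses to rank candidate attribute dependencies) together with one plausible reading of a per-instance "local" mutual information. The function names, and the exact form of the local metric, are our own assumptions for illustration, not the paper's definitions.

```python
from collections import Counter
import math

def conditional_mutual_information(x, y, z):
    """Estimate I(X; Y | Z) in nats from three aligned discrete sequences.

    Plug-in estimator of the metric KDB-style learners use to rank
    attribute pairs; an illustrative sketch, not the authors' code.
    """
    n = len(x)
    p_xyz = Counter(zip(x, y, z))   # joint counts over (X, Y, Z)
    p_xz = Counter(zip(x, z))
    p_yz = Counter(zip(y, z))
    p_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), c in p_xyz.items():
        # sum over observed cells: p(x,y,z) * log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ]
        cmi += (c / n) * math.log(
            (c / n) * (p_z[zi] / n)
            / ((p_xz[(xi, zi)] / n) * (p_yz[(yi, zi)] / n)))
    return cmi

def local_mutual_information(x_col, c_col, x_val):
    """Mutual information between the class and the single event X = x_val
    observed in a testing instance -- a hedged reading of the 'local
    mutual information' idea; the paper's exact definition may differ.
    """
    n = len(x_col)
    p_x = x_col.count(x_val) / n
    lmi = 0.0
    for cls, cnt in Counter(c_col).items():
        p_c = cnt / n
        p_xc = sum(1 for xv, cv in zip(x_col, c_col)
                   if xv == x_val and cv == cls) / n
        if p_xc > 0:
            lmi += p_xc * math.log(p_xc / (p_x * p_c))
    return lmi
```

For a balanced binary attribute that is fully redundant with another (x == y) under a constant conditioning variable, the first estimator returns H(X) = log 2 ≈ 0.693, while independent attributes score near zero; this gap is what drives edge selection in KDB-style topology learning.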