A Bayesian network (BN) is a mathematical model based on probability theory and graph theory. It provides an explicit, graphical, and interpretable form of knowledge representation and inference under uncertainty. However, learning an optimal BN is an NP-hard problem. Researchers have therefore proposed many learning methods for BNs from the perspectives of probabilistic inference, information theory, structure learning, parameter estimation, feature selection, and ensemble learning, combined with heuristic and other learning strategies. Nevertheless, according to the published research, BNs still face several open problems:

1) Most research on BNs belongs to the family of eager learning algorithms. In general, an eager learning algorithm tries to build, from the training data and before any testing instance arrives, a single classifier that is effective for all testing instances. However, it is extremely difficult to learn such a "perfect" classifier, and such a classifier cannot take different measures for different classification problems. It can only classify testing instances based on knowledge learned from the training data, ignoring information hidden in the testing instances themselves that may be helpful for classification.

2) In the topology of BNs, the difference between information theory and probability theory in measuring dependence between attributes may lead to inconsistent descriptions of conditional independence. For example, researchers usually apply measure functions such as conditional mutual information to quantify the dependence between attributes within a BN. However, because conditional mutual information is an information-theoretic average, it cannot measure the probabilistic (in)dependence between specific attribute values. That is, an (in)dependence relationship between attributes in the sense of information theory does not imply that the corresponding relationship between attribute values holds in the sense of probability theory, and vice versa. The
neglect of this difference may weaken the knowledge representation ability of BNs.

3) Although BNs are also called belief networks or causal networks, causality in BNs remains a controversial topic in artificial intelligence. The definition of causality between attributes is far more complex and subtle than that of correlation. The symmetry of the conditional mutual information expression means that it can describe only undirected correlation, not directed causality. Given the directed acyclic structure of BNs, most existing studies rely on artificially defined arc-orientation strategies, which cannot reflect real causal relationships.

To address the above issues, the main contributions of this paper are as follows:

1) We introduce the conditionally independent and identically distributed (c.i.i.d.) assumption into BNs by assuming that all instances with the same class label are conditionally independent of each other and drawn from the same probability distribution, and we generalize the information-theoretic measure functions widely used for BN structure learning to measure, at a finer-grained level, the dependencies between attributes (or attribute values) hidden in the training data (or testing instances). On this basis, a semi-lazy Bayesian network classifier (SLB) is proposed. SLB builds a series of class-specific local classifiers to respectively mine the implicit dependency relationships among the attribute values of an unlabeled instance. Experiments show that SLB achieves better classification performance and higher classification efficiency than other state-of-the-art algorithms.

2) We prove that the difference between information theory and probability theory in measuring conditional independence leads to inconsistent descriptions of conditional independence. The criteria of conditional (in)dependence in information theory and probability theory are redefined, and a novel learning
framework for BNs, called Hierarchical Independence Thresholding (HIT), is proposed to identify informational and probabilistic independence relationships between attributes within a BN. Based on the recognition results, an adaptive thresholding method is proposed to filter out redundant dependencies. Experiments show that HIT simultaneously improves Bayesian network classifiers with a high degree of dependence in terms of zero-one loss, root mean squared error, bias, and variance.

3) We carry out an exploratory study of causal relationships in BNs from the perspective of information entropy. The paper first defines, via the log-likelihood function, the mapping between the joint entropy function and the joint probability distribution of a BN, and then proposes a class conditional entropy function and a local conditional entropy function, both based on the joint entropy function, to identify causal relationships between attributes in the topology. Finally, a label-driven heuristic structure learning method is proposed to build a Bayesian network classifier that balances fitting the labeled data and generalizing to unlabeled data. Experiments show that the new algorithm has significant advantages over other state-of-the-art algorithms in terms of zero-one loss, bias, and variance.
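For the entropy-based perspective in contribution 3), a standard identity from the BN structure-learning literature connects the log-likelihood to empirical conditional entropies; whether it coincides with the exact mapping defined in this paper is an assumption, but it shows the general form such a mapping takes. For a structure G over attributes X_1, ..., X_n with maximum-likelihood parameters estimated from a dataset D of N instances:

```latex
\log \hat{P}(\mathcal{D}\mid \mathcal{G})
  \;=\; \sum_{d=1}^{N}\sum_{i=1}^{n} \log \hat{P}\bigl(x_i^{(d)} \,\big|\, \mathrm{pa}_i^{(d)}\bigr)
  \;=\; -\,N \sum_{i=1}^{n} H_{\mathcal{D}}\bigl(X_i \,\big|\, \mathrm{Pa}(X_i)\bigr)
```

where H_D denotes the empirical conditional entropy. Under this identity, maximizing the log-likelihood over structures is equivalent to minimizing the sum of conditional entropies of each attribute given its parents, which is why entropy functions are natural candidates for scoring and orienting arcs.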
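The inconsistency raised in problem 2) can be made concrete with a small numeric sketch. The code below is an illustration only, not any algorithm from this paper: the function names and the toy data are invented for exposition. It estimates conditional mutual information I(X_i; X_j | C) from discrete records and, separately, the pointwise (value-level) term for one specific pair of attribute values; on the toy data the average is small while one value pair is strongly dependent.

```python
# Illustrative sketch (not the paper's algorithm): conditional mutual
# information, an information-theoretic average, can be small even though a
# specific pair of attribute values is strongly dependent in the pointwise,
# probability-theoretic sense. All names and data are invented.
import math
from collections import Counter

def cond_mutual_info(data, i, j, c):
    """Empirical I(X_i; X_j | C) in bits, from a list of discrete records."""
    n = len(data)
    nc = Counter(r[c] for r in data)
    nic = Counter((r[i], r[c]) for r in data)
    njc = Counter((r[j], r[c]) for r in data)
    nijc = Counter((r[i], r[j], r[c]) for r in data)
    cmi = 0.0
    for (xi, xj, xc), count in nijc.items():
        # pointwise term: log2[ P(xi, xj | xc) / (P(xi | xc) P(xj | xc)) ]
        pointwise = math.log2(count * nc[xc] / (nic[(xi, xc)] * njc[(xj, xc)]))
        cmi += (count / n) * pointwise
    return cmi

def pointwise_cmi(data, i, j, c, xi, xj, xc):
    """Value-level dependence: log2 P(xi, xj | xc) / (P(xi | xc) P(xj | xc))."""
    nc = sum(1 for r in data if r[c] == xc)
    nic = sum(1 for r in data if r[i] == xi and r[c] == xc)
    njc = sum(1 for r in data if r[j] == xj and r[c] == xc)
    nijc = sum(1 for r in data if r[i] == xi and r[j] == xj and r[c] == xc)
    return math.log2(nijc * nc / (nic * njc))

# Toy data: attributes X0 and X1 are almost independent given the class,
# but the rare value pair (b, v) is strongly (negatively) dependent.
data = ([("a", "u", 0)] * 45 + [("a", "v", 0)] * 45
        + [("b", "u", 0)] * 9 + [("b", "v", 0)] * 1)
print(cond_mutual_info(data, 0, 1, 2))            # ≈ 0.05 bits: weak on average
print(pointwise_cmi(data, 0, 1, 2, "b", "v", 0))  # ≈ -2.2 bits: strong at the value level
```

An attribute-level threshold applied to the first number would discard the arc between X0 and X1 even though the second number shows a strong value-specific dependence, which is exactly the kind of mismatch a value-level (semi-lazy) treatment can exploit.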