
Research On Causality Discovery Algorithm Based On Bayesian Network

Posted on: 2021-04-10 | Degree: Master | Type: Thesis
Country: China | Candidate: F Gao | Full Text: PDF
GTID: 2480306548494464 | Subject: Computer Science and Technology
Abstract/Summary:
To understand how things act on one another and to put that understanding to use, we must understand the relationships between them, among which causality is an important and universal one. Exploring the causality between variables helps people better understand how things influence each other. However, relationships between things are complex, and causality is rarely obvious. With the help of domain experts, systems in various fields can be modeled manually. In the absence of expert knowledge, models must instead be built automatically from observational data obtained in production and daily life. With the advent of the big data era, more and more data is available, which makes it possible to discover causality directly from data. On the other hand, the information contained in the data is complex and diverse, so discovering causality from big data requires dedicated techniques. A Bayesian network intuitively represents the dependencies between variables, and the strength of each dependency is described by a conditional probability table, which makes it a very suitable model for causal discovery. In this thesis, we discover causality based on Bayesian networks and develop causal reasoning software. Our main contributions are:

(1) Causality discovery based on the K2 algorithm. We use the K2 algorithm to learn the Bayesian network. Because the accuracy of K2 depends largely on the predefined node order, we propose a method that sorts the variable nodes by node priority. Compared with a random node order and with an MCMC algorithm that learns the Bayesian network directly, our algorithm performed best in 580 of 1,000 repeated experiments. We also applied the algorithm to a flight-weather data set for US domestic airlines and established the causal graph structure between flight delay and weather factors such as fog, humidity, low pressure, rain and snow, low temperature, thunder, and low visibility.

(2) Causal discovery based on intervention. Structural intervention is used to introduce interventional data, and the causal network is learned iteratively. To find the best node to intervene on in the current structure, we propose an intervention node selection method based on the Jaccard similarity coefficient. By cutting off the influence of each node on its child nodes in turn and comparing the resulting sample data with the sample data generated by the original network, the method identifies the node with the greatest influence on the other nodes in the network and takes it as the intervention node, so as to achieve the best intervention effect. Starting from an observational data set of 500 cases, we carried out 9 rounds of iterative intervention after the initial causal discovery. The measured difference between the learned Bayesian network and the true network was 7, 7, 9, 8, 4, 5, 9, 6, 4 across the rounds, showing an overall downward trend.

(3) Code implementation of causal inference. After the graphical causal model is established, the code proposes hypotheses about causal relationships between variables based on the do-calculus criteria, estimates them, obtains the causal inference conclusion, and finally verifies the conclusion through a secondary estimation.
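To make the role of the node order in item (1) concrete, the following is a minimal sketch of the standard K2 greedy search with the Cooper-Herskovits score, not the thesis code. The node order is simply passed in; the thesis's priority-based ordering method is not described in the abstract and is not reproduced here. The function names, the `max_parents` limit, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def k2_log_score(data, child, parents, arity):
    """Log Cooper-Herskovits (K2) score of `child` given a parent list."""
    r = arity[child]
    n_ij = Counter()   # rows per observed parent configuration
    n_ijk = Counter()  # rows per (parent configuration, child value)
    for row in data:
        cfg = tuple(row[p] for p in parents)
        n_ij[cfg] += 1
        n_ijk[(cfg, row[child])] += 1
    # log[(r-1)! / (N_ij + r - 1)!] for each observed configuration;
    # unobserved configurations contribute 0 and can be skipped.
    score = sum(math.lgamma(r) - math.lgamma(n + r) for n in n_ij.values())
    # log[N_ijk!] for each observed (configuration, value) pair.
    score += sum(math.lgamma(n + 1) for n in n_ijk.values())
    return score


def k2(data, order, arity, max_parents=2):
    """Greedy K2 search: a node may only take parents that precede it in `order`."""
    parents = {v: [] for v in order}
    for idx, child in enumerate(order):
        best = k2_log_score(data, child, parents[child], arity)
        candidates = set(order[:idx])
        while candidates and len(parents[child]) < max_parents:
            scored = [
                (k2_log_score(data, child, parents[child] + [c], arity), c)
                for c in sorted(candidates)
            ]
            top_score, top = max(scored)
            if top_score <= best:
                break  # no candidate improves the score
            parents[child].append(top)
            candidates.remove(top)
            best = top_score
    return parents


# Toy data (hypothetical): B copies A, while C is independent of both.
data = [{"A": i % 2, "C": (i // 2) % 2, "B": i % 2} for i in range(40)]
arity = {"A": 2, "B": 2, "C": 2}
learned = k2(data, ["A", "C", "B"], arity)
print(learned)
```

Because each node can only choose parents among its predecessors, a poor order can force K2 to miss or reverse edges, which is why the choice of order matters so much for the learned structure.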
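The intervention node selection in item (2) can be illustrated with a small sketch. The abstract does not say exactly how the Jaccard coefficient is applied to the sample data, so this example assumes it is computed over the sets of distinct sampled configurations; the function names and the hand-written samples are hypothetical, and sampling from the networks themselves is left out.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections, compared as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pick_intervention_node(original_samples, cut_samples):
    """Return the node whose cut changes the samples most, plus all similarities.

    `cut_samples` maps each node to samples drawn after that node's
    influence on its children has been cut (assumed to be given).
    """
    sims = {node: jaccard(original_samples, s) for node, s in cut_samples.items()}
    return min(sims, key=sims.get), sims


# Hypothetical samples over three binary variables.
original = [(0, 0, 0), (1, 1, 1), (0, 0, 0), (1, 1, 0)]
cut = {
    "A": [(0, 1, 1), (1, 0, 0), (0, 0, 0), (1, 1, 1)],
    "B": [(0, 0, 0), (1, 1, 1), (1, 1, 0), (0, 0, 1)],
}
node, sims = pick_intervention_node(original, cut)
print(node, sims)
```

A lower similarity means that cutting the node changed the generated data more, so under this assumption the node with the lowest similarity is taken as the intervention node.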
Keywords/Search Tags: Bayesian network, Causality, Intervention learning, Causal inference