Research On The Event Causality Of The Multiple Type Data

Posted on: 2015-07-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 1315330536467149
Subject: Computer Science and Technology
Abstract/Summary:
Causality research aims to reveal the laws governing the evolution of nature, society, and humanity. It can be used to explain phenomena, control objects, and predict the future. In causality research, an event is something that happens in a particular situation and may be a cause or an effect of other things. Event causality research based on numerical data and news text uses computer science and technology to extract causal relationships between events in many domains, such as medicine, economics, politics, military affairs, the environment, and scientific research, by first analyzing and processing numerical data and news texts. Once the event causality network has been obtained, we can analyze the situation and forecast the trend of hot events. Event causality research is therefore an interdisciplinary field that combines current affairs analysis with computational science, including probability and statistics, text mining, and natural language processing.

Causality research now receives widespread attention because of its important applications in daily life and scientific research, but many difficulties and challenges remain. Causality is still often misunderstood, and existing work has shortcomings and even errors. Another problem is how to handle the complicated factors of real environments efficiently. Many algorithms for causality discovery and inference remain sensitive to the application domain and have high computational complexity. At the same time, causality research needs more effective and credible datasets and evaluation criteria.

Motivated by these problems, we study causality systematically, covering the essence of causality, its properties and types, its physical mechanism, the elements, lifecycle, and exogenous driving of the causality process, causality discovery methods, and prediction using causality. Following these views on causality, together with Pearl's theories and classical methods, we present the framework ICIC_Framework and the methods
ICIC_Discovery, ICIC_Analysis, and ICIC_Prediction for event causality research based on numerical data, aiming at higher performance and stronger robustness, together with a comprehensive evaluation method.

ICIC_Discovery can infer the global causal network, or the local network of a target variable, from observational data (discrete data, continuous data, or a mixture of both) without being limited by assumptions such as an acyclic structure. We use exogenous variables and a clique-like structure (IClique) to obtain a rough ordering of the variables needed to reveal causalities and their directions accurately. To evaluate our approach, we ran experiments comparing it with HITON, IC, PC, Apriori, and several other methods on six datasets with different data types. The results demonstrate higher performance and stronger robustness. We also discuss the reliability, stability, and complexity of ICIC_Discovery.

We then use ICIC_Analysis to analyze properties of the global or local causal network discovered by ICIC_Discovery, such as the unified model of stimulating and inhibiting event causalities, cooperation or competition between causes, the "evolving"
models of each sample, and the network with hidden variables. These properties, which correspond to physical mechanisms of the causality process, help to explain or predict event causality more reasonably.

ICIC_Prediction, presented in this paper, avoids a limitation of traditional prediction methods, which usually use the Markov blanket (MB) of the target variable to predict its value (i.e., the target event). Our experiments often showed that the value of the target variable may be influenced by the behavior of all variables in one or more samples, regardless of where these variables are located. We therefore propose ICIC_Prediction, which treats all variables of a dataset as a whole system and uses the global and local properties of this system to predict the value.

The event causality research based on news text sequences that we propose is an interdisciplinary area combining causality research, data mining, and political or policy analysis, among others. Compared with traditional research, it has many new characteristics and faces greater challenges, which come from the stronger requirement for real-time computation and the extremely complicated environments of the real world: the research involves so many fields that the knowledge required may exceed the insight and capability of experts. Worse, even when there is a rule that events should follow, random factors can make the trend or trajectory of an event more complicated. Computer science and technology are therefore well suited to this research.

The method we propose, ICIC_Prediction_News Event, extracts the event structure (date, location, person or organization, keywords, and other important information about the event) from news text using natural language processing tools, and digitizes it into three classes of numerical sequences (location, person or organization, and keyword sequences) that follow the temporal order of the events. These sequences are usually sparse, with a
large number of zero positions, which may cause a "probability failure". To solve this problem, all news events are clustered into groups according to synonyms. When extracting causal relationships between event classes from event sequences, we filter out irrelevant events according to date and location information and sort the relevant news events by their importance and strength of relation. Event causality prediction, which is based on analysis of the causal structure, history, and current situation, aims to predict the occurrence of a specific event in the future. It can provide more credible and objective evidence to assist people in decision-making, policy-making, or manipulation. Experiments based on news event lists in Wikipedia show that ICIC_Prediction_News Event is able to predict news events automatically.

Validating causality discovery and prediction methods and their results in real-life environments is a very difficult and urgent problem. Some evaluation indices and datasets exist for testing causal structure and prediction, but most of these datasets are laboratory simulation data and are strongly tied to a specific field. Although they are very expensive, these datasets are of limited use in new fields. To address these problems, this paper puts forward an evaluation method that combines theoretical analysis with comparison experiments. We show that ICIC_Discovery is reliable and stable when the input contains small errors. The evaluation method also includes experimental validation and performance analysis: we compare IC, GC, and several other popular methods with ICIC_Discovery, ICIC_Analysis, and ICIC_Prediction on many published and widely used datasets to show the advantages of our methods. In addition, causal relationships in Wikipedia news text, annotated manually by volunteers, are used to test ICIC_Prediction_News Event. Our evaluation method offers a more comprehensive way for
causality evaluation.
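
The digitization and synonym-clustering step described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the toy events, the synonym table, and the helper names to_sequences and sparsity are all hypothetical, and only keyword sequences are shown. The sketch turns dated events into one count sequence per keyword, then merges synonymous keywords into one event class to reduce the share of zero positions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy events (date, keywords); not data from the dissertation.
EVENTS = [
    (date(2014, 1, 1), ["protest"]),
    (date(2014, 1, 2), ["demonstration"]),
    (date(2014, 1, 3), ["sanction"]),
    (date(2014, 1, 4), ["rally", "embargo"]),
]

# Hypothetical synonym table mapping each keyword to an event class.
SYNONYMS = {
    "protest": "protest",
    "demonstration": "protest",
    "rally": "protest",
    "sanction": "sanction",
    "embargo": "sanction",
}


def to_sequences(events, mapping=None):
    """Build one count sequence per label, ordered by event date.

    If a mapping is given, keywords are first replaced by their class label.
    """
    days = sorted({d for d, _ in events})
    index = {d: i for i, d in enumerate(days)}
    seqs = defaultdict(lambda: [0] * len(days))
    for d, keywords in events:
        for kw in keywords:
            label = mapping.get(kw, kw) if mapping else kw
            seqs[label][index[d]] += 1
    return dict(seqs)


def sparsity(seqs):
    """Fraction of zero positions over all sequences."""
    cells = [v for seq in seqs.values() for v in seq]
    return sum(1 for v in cells if v == 0) / len(cells)


raw = to_sequences(EVENTS)
clustered = to_sequences(EVENTS, SYNONYMS)
print(sparsity(raw), sparsity(clustered))
```

In this toy input, the five raw keyword sequences collapse into two class sequences, so fewer positions are zero. Location and person or organization sequences could be built the same way from their own fields.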
Keywords/Search Tags:Causality, Causality Discovery, Causality Network, Stimulating Causality, Inhibiting Causality, News Text, Event Causality, Event Cluster, Hidden Variable, Burst Interval