Inductive Logic Programming For Data Mining

Posted on: 2007-11-03    Degree: Master    Type: Thesis
Country: China    Candidate: L L Zhao    Full Text: PDF
GTID: 2178360182996422    Subject: Computer software and theory
Abstract/Summary:
The volume of data in the world has grown with the rapid development of database technology and the widespread use of database management systems (DBMS). The query mechanisms of current DBMSs and statistical methods alone are not efficient enough for mining the knowledge hidden in databases, so finding knowledge that is valuable for decision-making has become a pressing need and a challenging task. Data mining (DM) is a new field that addresses this need. It aims to extract hidden, potentially useful information and knowledge from large amounts of imperfect and stochastic data. It draws on multiple techniques, including machine learning, fuzzy logic, pattern recognition, neural networks, genetic algorithms, and statistics.

Inductive logic programming (ILP) is a research area at the intersection of machine learning (ML) and logic programming (LP). It brings the idea of induction into the mature theory and techniques of logic programming, overcoming limitations of traditional machine learning, and it is more expressive and more usable. Since machine learning is used in DM, ILP can naturally be used in DM as well.

Our group is cooperating with a group from the College of Chemistry to develop a chemistry database together with the corresponding database management system and analysis system. The analysis of the database is based on DM. Against this background, this thesis develops a simple inductive logic programming system and its interface to the chemistry database. The system serves as one analysis method, working alongside several others.

Our ILP system performs a top-down search of the refinement graph, using an A*-like algorithm to search the graph heuristically from scratch. For efficiency, the system uses a pruning algorithm to cut down the search space. It also uses a noise-handling mechanism to deal with domains that contain imperfect data.
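
The top-down search described above can be illustrated with a small sketch. It is not the thesis's system: it assumes a propositional simplification in which each example is a set of ground literals and a clause body is a set of literals, and the literal names (`has_ring`, `polar`, `heavy`), the scoring function, and the `max_noise` and `max_len` parameters are all hypothetical. It starts from the empty (most general) body, refines by adding one literal at a time, expands the most promising clause first, prunes branches that cannot beat the best clause found so far, and tolerates a bounded number of covered negative examples as a crude stand-in for noise handling.

```python
import heapq
from itertools import count


def coverage(body, pos, neg):
    """Count the positive and negative examples covered by a clause body.

    A body covers an example when every literal of the body holds in it.
    """
    p = sum(body <= e for e in pos)
    n = sum(body <= e for e in neg)
    return p, n


def refine(body, literals):
    """Downward refinement: specialise a clause by adding one new literal."""
    for lit in sorted(literals - body):
        yield body | {lit}


def learn_clause(pos, neg, literals, max_noise=0, max_len=3):
    """Best-first (A*-like) top-down search over the refinement graph.

    Returns (body, positives_covered) for the best acceptable clause,
    or (None, 0) if no clause satisfies the noise bound.
    """
    tie = count()  # tie-breaker so the heap never has to compare sets
    start = frozenset()
    frontier = [(0, next(tie), start)]
    seen = {start}
    best, best_pos = None, 0
    while frontier:
        _, _, body = heapq.heappop(frontier)
        p, n = coverage(body, pos, neg)
        if n <= max_noise:
            # Acceptable clause; specialising it further cannot cover more.
            if p > best_pos:
                best, best_pos = body, p
            continue
        if p <= best_pos or len(body) >= max_len:
            continue  # prune: no refinement of this clause can beat `best`
        for child in refine(body, literals):
            if child not in seen:
                seen.add(child)
                cp, cn = coverage(child, pos, neg)
                # Lower is better: few negatives, many positives, short body.
                heapq.heappush(frontier, (cn - cp + len(child), next(tie), child))
    return best, best_pos


# Hypothetical toy data: which compounds are "active"?
literals = {"has_ring", "polar", "heavy"}
pos = [{"has_ring", "polar"}, {"has_ring", "polar", "heavy"}, {"has_ring", "polar"}]
neg = [{"has_ring"}, {"polar"}, {"heavy"}]

rule, covered = learn_clause(pos, neg, literals)
noisy_rule, _ = learn_clause(pos, neg, literals, max_noise=1)
```

With `max_noise=0` the search has to specialise down to a two-literal body, while allowing one covered negative lets it stop at a more general clause. A first-order system would refine with variables and mode declarations rather than plain literal sets.
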
In the implementation, we place some restrictions on the bias (mode declarations) for the sake of efficiency and feasibility; doing so weakens the power of the system.

To apply the ILP system to DM, this thesis studies both inductive logic programming and database mining, summarizes the relationship between relational databases and ILP systems, and analyzes three methods of coupling ILP with a relational database. The thesis adopts the first method. By mapping relation attributes in the relational database to predicates in ILP, it implements a loosely coupled interface between the DBMS and the ILP system. That is, the parts of the database corresponding to the training set, background knowledge, type declarations, mode declarations, and so on are transformed into Horn clauses, which are then imported into the ILP system. The interface is implemented in C++ with ODBC (Open Database Connectivity) in VC++ 6.0. As a result, our system is independent of the DBMS, and, as an ILP system, it can learn knowledge from multiple tables.

Some work remains, such as introducing a constraint mechanism into the ILP system, developing a friendlier user interface, and using more efficient tight coupling to combine the ILP system with the DBMS.
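
As a rough illustration of the loosely coupled interface, the sketch below exports the rows of a table as ground Horn-clause facts that an ILP system could load. It rests on several assumptions that are not in the thesis: it uses Python's built-in `sqlite3` module as a stand-in for the C++/ODBC interface, it reads "attribute to predicate" as one binary predicate `attribute(key, value)` per non-key column, and the table `compound`, its columns, and the helper names are hypothetical. Column names are assumed to be valid lowercase Prolog atoms.

```python
import sqlite3


def prolog_term(value):
    """Render a column value as a Prolog constant: numbers as-is, text quoted."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def table_to_facts(conn, table, key):
    """Emit one fact `column(key_value, value).` per non-key, non-NULL cell."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    key_idx = columns.index(key)
    facts = []
    for row in cur:
        for col, value in zip(columns, row):
            if col == key or value is None:
                continue  # the key is an argument, and NULLs carry no fact
            facts.append(f"{col}({prolog_term(row[key_idx])}, {prolog_term(value)}).")
    return facts


# Hypothetical table standing in for part of the chemistry database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compound (id TEXT, weight REAL, soluble INTEGER)")
conn.execute("INSERT INTO compound VALUES ('c1', 18.0, 1)")
conn.execute("INSERT INTO compound VALUES ('c2', 78.1, NULL)")

facts = table_to_facts(conn, "compound", "id")
# facts == ["weight('c1', 18.0).", "soluble('c1', 1).", "weight('c2', 78.1)."]
```

Because the export only issues a plain `SELECT`, the same idea carries over to any DBMS reachable through a generic connectivity layer, which is the sense in which a loosely coupled interface stays independent of the DBMS.
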
Keywords/Search Tags: Programming