
The Research And Design Of Dynamic Data Cleansing Based On Java Rule Engine

Posted on: 2009-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Cao
Full Text: PDF
GTID: 2178360245454988
Subject: Computer application technology
Abstract/Summary:
In the course of operations management, companies accumulate large volumes of vital electronic data. Decision-makers depend increasingly on this data for analysis and strategy, and wrong or conflicting data can lead to poor decisions and, in turn, serious losses. It is therefore essential to process data before it enters a decision-making system, in order to improve its credibility and usability.

To address this problem, experts have proposed data cleansing: detecting "dirty data" within massive data sets according to certain rules (domain knowledge) and repairing or discarding it according to other rules (cleansing action rules).

Traditional data cleansing tools share one shortcoming: they are inefficient in practice. The logic for detecting and repairing "dirty data" is either embedded in code, so that every change requires modifying and recompiling the cleansing software, or it relies on flexible but inefficient manual judgment. The emergence of the Java Rule Engine provides a feasible technical foundation for a data cleansing approach based on dynamic, configurable rules.

The thesis presents the basic principles of rule engines and investigates the working mechanism of the Java Rule Engine and its core algorithm, the Rete algorithm.
The thesis also introduces Drools, an open-source Java Rule Engine software package, and systematically investigates its API usage and the structure and meaning of its rule configuration file.

The thesis focuses on the design of a dynamic data cleansing system based on the Drools rule engine. It investigates the BNF (Backus-Naur Form) definition of domain knowledge and cleansing rules, which lays a solid foundation for the persistence of rules.

The thesis presents the design and implementation of a dynamic cleansing system that uses the Drools rule engine to describe and execute cleansing logic and can handle many kinds of data quality problems. The system remedies the shortcoming of existing data cleansing tools, mainly through persistent storage of cleansing rules and dynamic updating of the Drools rule configuration file. The thesis describes in detail the rule database design, the division into functional modules, the architecture, the system workflow, and selected code segments from the main modules. It also presents an analysis of the experimental performance results.
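As a rough illustration of keeping cleansing logic in replaceable rules rather than in hard-coded logic, the following Java sketch pairs a detection condition with a repair-or-discard action and lets the rule set change at run time. It is not the thesis's implementation and does not use the Drools API. The names CleansingSketch, Rule, RuleSet, exampleRules, and the name and age fields are all hypothetical, and the rules are held in memory here rather than in a rule database or a Drools rule file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical sketch: plain Java, no rule engine, rules kept in memory.
final class CleansingSketch {

    /** Pairs a "dirty data" test with an action; an action that returns null discards the record. */
    static final class Rule {
        final String name;
        final Predicate<Map<String, String>> isDirty;
        final UnaryOperator<Map<String, String>> action;

        Rule(String name, Predicate<Map<String, String>> isDirty,
             UnaryOperator<Map<String, String>> action) {
            this.name = name;
            this.isDirty = isDirty;
            this.action = action;
        }
    }

    /** A rule collection that can be changed at run time without touching the cleansing loop. */
    static final class RuleSet {
        private final List<Rule> rules = new ArrayList<>();

        void add(Rule rule) { rules.add(rule); }

        boolean remove(String name) { return rules.removeIf(r -> r.name.equals(name)); }

        /** Applies every rule in order to a copy of each record; discarded records are left out. */
        List<Map<String, String>> cleanse(List<Map<String, String>> records) {
            List<Map<String, String>> clean = new ArrayList<>();
            for (Map<String, String> original : records) {
                Map<String, String> record = new HashMap<>(original);
                for (Rule rule : rules) {
                    if (rule.isDirty.test(record)) {
                        record = rule.action.apply(record);
                        if (record == null) break; // discarded by this rule
                    }
                }
                if (record != null) clean.add(record);
            }
            return clean;
        }
    }

    /** Builds a record with the two hypothetical fields used by the example rules. */
    static Map<String, String> record(String name, String age) {
        Map<String, String> r = new HashMap<>();
        r.put("name", name);
        r.put("age", age);
        return r;
    }

    /** Returns true only when the text parses as an integer below zero. */
    static boolean isNegativeInt(String s) {
        if (s == null) return false;
        try {
            return Integer.parseInt(s.trim()) < 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Two illustrative rules: one repairs a record, the other discards it. */
    static RuleSet exampleRules() {
        RuleSet rules = new RuleSet();
        rules.add(new Rule("trim-name",
                r -> r.get("name") != null && !r.get("name").equals(r.get("name").trim()),
                r -> { r.put("name", r.get("name").trim()); return r; }));
        rules.add(new Rule("drop-negative-age",
                r -> isNegativeInt(r.get("age")),
                r -> null));
        return rules;
    }

    public static void main(String[] args) {
        RuleSet rules = exampleRules();
        List<Map<String, String>> input = new ArrayList<>();
        input.add(record("  Alice ", "30"));
        input.add(record("Bob", "-5"));
        System.out.println(rules.cleanse(input).size() + " record(s) kept");
        rules.remove("drop-negative-age"); // rule set changes without recompiling cleanse()
        System.out.println(rules.cleanse(input).size() + " record(s) kept");
    }
}
```

In a rule-engine-based design such as the one the abstract describes, the conditions and actions would be expressed in the engine's rule files and loaded from persistent storage rather than written as Java lambdas. The sketch only shows the separation between the cleansing loop and the rules it applies.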
Keywords/Search Tags:Rule Engine, Dynamic Data Cleansing, Drools, Data Transformation