
Research On Clustering Outlier Detection In The Audit Field

Posted on: 2012-07-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Tan
Full Text: PDF
GTID: 2219330368481944
Subject: Computer software and theory
Abstract/Summary:
Outlier detection is also known as small-event detection or error detection. In some applications, small-probability events are more interesting and often have more research value than regular events. In essence, a data clustering algorithm groups the data in a data set so that data within the same group are as similar as possible and data in different groups are as different as possible. In some earlier clustering algorithms, outlier detection is only a by-product of the clustering process, so important information is lost. Outlier mining, by contrast, focuses only on detecting outliers and ignores the distribution of the data, which makes the outliers harder to analyze and reduces their practical value. Combining an outlier mining algorithm with clustering analysis can give a more accurate understanding of the data distribution.

The audit method determines the quality of audit results, and the dynamic monitoring indicators also determine the effectiveness of the audit. Traditional audit methods are often constructed from expert experience and from policies and regulations, and they have many deficiencies. Mining information from the mass of audit data with data mining technology is therefore of both theoretical and practical significance.
The mined data can support decision-making in constructing audit methods and refining dynamic monitoring indicators.

This thesis presents the DBSCAN_LOF algorithm. It integrates the DBSCAN determination of core objects with the idea of direct density-reachability, redefines the concept of a core object, and adds the concept of the k-distance neighborhood radius. The algorithm combines the k-nearest-neighbor concepts of the clustering algorithm and of the outlier algorithm. In doing so, it removes the dependence of traditional clustering-based outlier detection on the clustering results, reduces the effect that parameter sensitivity and unevenly distributed data have on DBSCAN's clustering results, and detects outliers while clustering rapidly. On such data sets, this thesis compares DBSCAN_LOF with the original clustering algorithms in terms of effectiveness and results. Then, taking social security audit data as the experimental data, it preprocesses the audit data, which contain complex data types and numerical data with different meanings. Finally, the thesis uses DBSCAN_LOF for experimental verification and carries out data mining that can support decision-making in constructing audit methods.
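The abstract does not give the details of DBSCAN_LOF, so the following is only a minimal sketch of one plausible reading, not the thesis's algorithm. It assumes that a point counts as a core object when its k-distance is at most `eps`, and that the same k-nearest-neighbor lists are reused to compute LOF scores. The names `knn`, `lof_scores`, `cluster_and_detect`, and `lof_threshold` are hypothetical, and ties in the k-distance neighborhood are cut to exactly k neighbors for simplicity.

```python
import math


def knn(points, k):
    """For each point, return its k nearest neighbours as sorted (distance, index) pairs."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result


def lof_scores(neigh):
    """Local outlier factor computed from precomputed k-nearest-neighbour lists."""
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour
    lrd = []  # local reachability density of each point
    for nb in neigh:
        reach = [max(k_dist[j], d) for d, j in nb]  # reachability distances
        mean_reach = sum(reach) / len(reach)
        lrd.append(1.0 / mean_reach if mean_reach > 0 else math.inf)
    scores = []
    for i, nb in enumerate(neigh):
        if math.isinf(lrd[i]):
            scores.append(1.0)  # crude handling of duplicate points (assumption)
        else:
            scores.append(sum(lrd[j] for _, j in nb) / (len(nb) * lrd[i]))
    return scores


def cluster_and_detect(points, k, eps, lof_threshold=1.5):
    """Cluster DBSCAN-style and flag noise points whose LOF exceeds the threshold."""
    neigh = knn(points, k)
    lof = lof_scores(neigh)
    # Assumed core-object rule: the k-distance neighbourhood fits inside radius eps.
    core = [nb[-1][0] <= eps for nb in neigh]
    labels = [-1] * len(points)  # -1 means "not assigned to any cluster"
    cluster = 0
    for i in range(len(points)):
        if not core[i] or labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:  # expand the cluster through directly density-reachable points
            p = stack.pop()
            for j, q in enumerate(points):
                if labels[j] == -1 and math.dist(points[p], q) <= eps:
                    labels[j] = cluster
                    if core[j]:
                        stack.append(j)
        cluster += 1
    outliers = [i for i in range(len(points))
                if labels[i] == -1 and lof[i] > lof_threshold]
    return labels, lof, outliers


if __name__ == "__main__":
    # Two tight groups plus one isolated point (made-up illustrative data).
    points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11),
              (5, 20)]
    labels, lof, outliers = cluster_and_detect(points, k=2, eps=1.5)
    print(labels, outliers)
```

On this toy input the two groups receive separate cluster labels and the isolated point is the only one reported as an outlier. Computing the neighbor lists once and using them for both the core-object test and the LOF score is the design choice the sketch is meant to illustrate; the threshold of 1.5 is an arbitrary illustrative value.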
Keywords/Search Tags: Clustering, DBSCAN, LOF, Outlier mining, Audit method