Property Analysis-based Local Outlier Mining Algorithm And Its Application

Posted on:2012-01-17

Degree:Master

Type:Thesis

Country:China

Candidate:L Wang

Full Text:PDF

GTID:2208330335980078

Subject:Computer application technology

Abstract/Summary:

With the development of information technology, large amounts of data are stored in databases. Such data is typically high-dimensional, large in volume, and sparsely distributed, which poses a major challenge for outlier mining algorithms. Most traditional outlier mining methods identify outliers from a global point of view, which is inappropriate for high-dimensional, large data sets. This thesis presents a subspace-based outlier mining algorithm that uses attribute relevance analysis to find local outliers in low-dimensional subspaces. The main research work is as follows:

(1) An outlier mining algorithm based on attribute relevance analysis is presented. First, attribute relevance analysis removes the irrelevant attributes, which are the dimensions constituted from dense regions of the data set, so that both the data set and its dimensionality are reduced effectively and outlier mining efficiency improves. Second, sparse subspaces are searched using particle swarm optimization based on a sparsity coefficient threshold, and local outliers are identified in the sparse subspaces. Finally, experiments on a star spectrum data set validate the correctness and effectiveness of the algorithm.

(2) A parallel outlier mining algorithm based on attribute relevance analysis is presented. First, the main node distributes the attribute relevance analysis task; each sub-node finds the irrelevant attributes of the data set in parallel and returns them to the main node, which removes them. Second, the main node assigns the search task, and each sub-node uses particle swarm optimization to search for local outlier spaces in parallel. The main node then processes the returned outlier spaces to establish the global outliers. Finally, experiments on the star spectrum data set in a parallel computing environment validate the accuracy and effectiveness of the algorithm.

(3) On this basis, an outlier mining system for star spectrum data based on attribute relevance analysis is designed and implemented using VC++ 6.0 and Oracle 9i as development tools. The experimental results show that the system is feasible and valuable for mining outliers in star spectra.
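The abstract does not define the sparsity coefficient, so the following is only a minimal sketch under stated assumptions. It assumes the definition commonly used in subspace outlier detection (Aggarwal and Yu): each attribute is split into phi equi-depth ranges, and a k-dimensional cell holding n(D) of the N points scores S(D) = (n(D) - N f^k) / sqrt(N f^k (1 - f^k)) with f = 1/phi. Strongly negative values mark sparse cells. The sketch enumerates all k-dimensional cells exhaustively, which is a stand-in for the particle swarm search described above, not a reproduction of it. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def equi_depth_bins(column, phi):
    """Assign each value to one of phi equi-depth ranges, by rank."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * phi // n
    return bins


def sparsity_coefficient(count, n, phi, k):
    """Assumed (Aggarwal-Yu style) coefficient of a k-dim cell with `count` points."""
    f_k = (1.0 / phi) ** k
    return (count - n * f_k) / math.sqrt(n * f_k * (1.0 - f_k))


def sparse_cells(data, phi, k, threshold):
    """Exhaustive stand-in for the PSO search: list non-empty k-dim cells
    whose coefficient falls below `threshold`, most sparse first."""
    n, d = len(data), len(data[0])
    binned = [equi_depth_bins([row[j] for row in data], phi) for j in range(d)]
    results = []
    for dims in combinations(range(d), k):
        keys = [tuple(binned[j][i] for j in dims) for i in range(n)]
        for cell, count in Counter(keys).items():
            s = sparsity_coefficient(count, n, phi, k)
            if s < threshold:
                members = [i for i in range(n) if keys[i] == cell]
                results.append((dims, cell, s, members))
    return sorted(results, key=lambda r: r[2])


# Two correlated attributes; records 3 and 16 break the correlation.
data = [(i, i) for i in range(20)]
data[3], data[16] = (3, 16), (16, 3)
for dims, cell, s, members in sparse_cells(data, phi=2, k=2, threshold=-2.0):
    print(dims, cell, round(s, 3), members)
```

Exhaustive enumeration grows combinatorially with the number of attributes, which is presumably why the thesis first removes irrelevant attributes and then uses a heuristic search.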

Keywords/Search Tags:

Local outlier, Attribute relevance analysis, Particle swarm optimization, Subspace, Sparsity coefficient, Parallel computation, Irrelevant attributes

Related items

1	Research On Outlier Mining Algorithms Based On Subspace And Its Application
2	Research On Local Outlier Detection Technology
3	A Parallel Platform For Swarm Intelligent Algorithm's High Performance Computing
4	Study Of Parallel Particle Swarm Optimization
5	Research On Attribute Reduction Method Based On Particle Swarm Optimization
6	Perturbation Particle Swarm Optimization Algorithm Based On Local Far-neighbor Differential Enhancement
7	The Study Of Particle Swarm Optimization
8	Studies On Individual Particle Swarm Optimization
9	Study Of Attribute Reduction Algorithm Based On Genetic & Particle Swarm Optimization Algorithm And Rough Sets
10	Research And Application Of Particle Swarm Optimization On TSP