Integrative One-class SVM For Multi-source Anomaly Detection

Posted on: 2022-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Chen
GTID: 2568306323470814
Subject: Statistics
Abstract/Summary:
One-class SVM is a classical method for unsupervised anomaly detection and has been studied extensively on single datasets. In the age of big data, however, data often come from many sources. For anomaly detection across multiple datasets, a single model fitted to the directly merged data has low accuracy, while fitting a separate model to each dataset yields a large number of parameters and a high maintenance cost. Moreover, neither approach takes into account the similarity of variables across datasets from different sources. It is therefore of practical significance to identify datasets with similar anomaly patterns, cluster them, and fit a common model to each cluster, so that similar datasets share the same model parameters and estimation accuracy is preserved while maintenance cost is reduced. There is currently no literature on this problem. In this thesis, the idea of penalized integrative analysis is applied to anomaly detection. By imposing a pairwise penalty on the differences between the model coefficients of different datasets, an integrative one-class SVM anomaly detection model is proposed, and the ADMM algorithm is used to solve the resulting optimization problem. Simulation experiments show that when the datasets are homogeneous or partially homogeneous, the proposed method not only clusters the datasets accurately but also improves prediction performance relative to both directly merging the datasets and modeling each dataset separately. It also uses fewer parameters than building one model per dataset, which reduces maintenance cost. Building on the integrative one-class SVM, a sparse integrative one-class SVM is then proposed, since datasets from different sources may contain unimportant variables, which can cause overfitting on the training set and reduce the model's generalization performance. Simulation results show that the sparse integrative one-class SVM achieves a higher F1 score than the version without variable selection. Finally, the method is applied to anomaly detection in bank website logs. The empirical results show that the sparse integrative one-class SVM with the SCAD penalty performs best, outperforming both modeling all datasets together and modeling each dataset separately.
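The abstract does not state the objective function, so the following is only a plausible sketch of what a pairwise-penalized one-class SVM could look like, not the thesis's actual formulation. It assumes a linear one-class SVM in the standard primal form, M datasets indexed by m with n_m observations x_{mi}, a penalty function p_λ (for example SCAD) applied to the Euclidean distance between coefficient vectors, and an optional second penalty p_γ on individual coefficients for the sparse variant. All symbols (w_m, ρ_m, ξ_{mi}, ν, λ, γ) are notation introduced here for illustration.

```latex
% Hypothetical integrative one-class SVM objective (illustrative notation only).
% The first sum is the usual one-class SVM loss for each dataset m;
% the second sum penalizes pairwise differences between dataset coefficients.
\begin{align*}
\min_{\{w_m,\rho_m,\xi_m\}}\quad
  & \sum_{m=1}^{M}\Bigl(\tfrac{1}{2}\lVert w_m\rVert_2^2
      + \frac{1}{\nu n_m}\sum_{i=1}^{n_m}\xi_{mi} - \rho_m\Bigr)
    + \sum_{1\le m<m'\le M} p_\lambda\bigl(\lVert w_m - w_{m'}\rVert_2\bigr) \\
\text{s.t.}\quad
  & \langle w_m, x_{mi}\rangle \ge \rho_m - \xi_{mi},\qquad
    \xi_{mi}\ge 0,\qquad i=1,\dots,n_m,\; m=1,\dots,M.
\end{align*}

% Assumed sparse variant: add a coefficient-wise penalty for variable selection,
% where w_{mj} is the j-th coefficient of dataset m and d is the number of variables.
\[
  \text{(objective above)} \;+\; \sum_{m=1}^{M}\sum_{j=1}^{d} p_\gamma\bigl(\lvert w_{mj}\rvert\bigr)
\]
```

Under this sketch, a sufficiently large pairwise penalty forces w_m = w_{m'} for similar datasets, which is how clustering and parameter sharing would arise. The pairwise difference terms couple the datasets, which is the kind of structure that ADMM typically handles by introducing auxiliary variables for w_m - w_{m'}.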
Keywords/Search Tags: anomaly detection, one-class SVM, multi-source dataset, variable selection