| With the rapid development of sensor technology and signal acquisition instruments, the condition monitoring and intelligent operation and maintenance of rolling bearings have entered the "data-driven era". Rich and comprehensive status data support decision-making in condition monitoring and intelligent operation and maintenance of rolling bearings. At the same time, the new challenge of ensuring data quality has attracted widespread attention. The reliability and integrity of rolling bearing data are prerequisites for condition monitoring, fault diagnosis, and life prediction. Therefore, bearing signal data must be monitored to ensure data quality. This article treats the detection of abnormal data in bearing condition monitoring as a complete research task. First, it identifies and defines the types of abnormal data that occur in bearing data and explores the mechanism behind each type through experiments. Then, it uses semi-supervised and unsupervised anomaly detection algorithms to find suitable and accurate detection methods for bearings, so that abnormal data points can be detected and located. The main research content of this article is summarized as follows: (1) The definition and classification of abnormal data for rolling bearings are given, and the possibility and mechanism of each type of abnormality are verified through experimental simulation. In addition, according to how abnormal data appear within the overall data, a labeled dataset is constructed, which provides a basis for the subsequent performance evaluation of the proposed algorithms. According to the characteristics of the data, a progressive feature selection method based on RFE (Recursive Feature Elimination) and random forest is proposed, which can adaptively eliminate redundant features and achieve an effective representation of the data. (2) Based on the semi-supervised idea, an anomaly detection method based
on feature selection and Bayesian hyperparameter optimization for support vector data description (SVDD) is proposed. Feature selection reduces dimensionality, which not only lowers the time cost of building the model but also improves detection accuracy and avoids overfitting. In addition, a Bayesian hyperparameter optimization method is proposed to jointly optimize the parameter of the Gaussian kernel function and the penalty parameter C in the SVDD model, reducing the inaccuracy of the model during training and improving its accuracy for detecting abnormal railway bearings. (3) Based on the unsupervised idea, an isolation forest anomaly detection method based on parameter optimization and feature dimensionality reduction is proposed. PSO (Particle Swarm Optimization) is used to optimize, from a global perspective, the size of the sliding window used in feature extraction, with the maximum difference between the abnormal and normal scores in the detection results as the objective. This optimization not only increases detection accuracy but also reduces model complexity. The method effectively avoids the misjudgments caused by impure single-class samples in semi-supervised models. (4) In combined anomaly detection with the isolation forest algorithm, the difference in scores between global and local anomalies means that local anomaly scores are easily masked. This paper proposes an anomaly detection method that combines LOF (Local Outlier Factor) and isolation forest, making full use of the advantages of both methods, which not only alleviates the isolation forest's weak local anomaly detection capability but also effectively handles the detection of combined anomalies in rolling bearing data. |
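
The idea in item (4) can be illustrated with a minimal, dependency-free sketch. It is not the method developed in this work: the fusion rule (min-max normalization of each score followed by an element-wise maximum), the toy two-cluster data, and all function names are assumptions made for illustration only. The LOF here uses exactly k neighbours, and the isolation tree stops when the chosen feature is constant, both of which are simplifications.

```python
import math
import random


def lof_scores(X, k=5):
    """Local Outlier Factor with exactly k neighbours (ties broken by index)."""
    n = len(X)
    dist = [[math.dist(a, b) for b in X] for a in X]
    knn = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
           for i in range(n)]
    kdist = [dist[i][knn[i][-1]] for i in range(n)]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = [k / sum(max(kdist[j], dist[i][j]) for j in knn[i]) for i in range(n)]
    return [sum(lrd[j] for j in knn[i]) / (k * lrd[i]) for i in range(n)]


def _c(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def _grow(X, depth, limit, rng):
    """Grow one isolation tree by random axis-aligned splits."""
    if depth >= limit or len(X) <= 1:
        return ("leaf", len(X))
    q = rng.randrange(len(X[0]))
    lo, hi = min(x[q] for x in X), max(x[q] for x in X)
    if lo == hi:  # simplification: stop if the chosen feature is constant
        return ("leaf", len(X))
    p = rng.uniform(lo, hi)
    left = [x for x in X if x[q] < p]
    right = [x for x in X if x[q] >= p]
    return ("node", q, p,
            _grow(left, depth + 1, limit, rng),
            _grow(right, depth + 1, limit, rng))


def _path(x, tree, depth=0):
    """Path length of x in a tree, adjusted for unsplit leaves."""
    if tree[0] == "leaf":
        return depth + _c(tree[1])
    _, q, p, left, right = tree
    return _path(x, left if x[q] < p else right, depth + 1)


def iforest_scores(X, n_trees=100, psi=256, seed=0):
    """Isolation forest anomaly score in (0, 1]; higher means more anomalous."""
    rng = random.Random(seed)
    psi = min(psi, len(X))
    limit = math.ceil(math.log2(psi))
    trees = [_grow(rng.sample(X, psi), 0, limit, rng) for _ in range(n_trees)]
    return [2.0 ** (-sum(_path(x, t) for t in trees) / n_trees / _c(psi))
            for x in X]


def minmax(v):
    """Rescale scores to [0, 1] so the two detectors are comparable."""
    lo, hi = min(v), max(v)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in v]


def fused_scores(X, k=5):
    """Assumed fusion rule: element-wise maximum of the normalized scores."""
    return [max(a, b) for a, b in zip(minmax(lof_scores(X, k)),
                                      minmax(iforest_scores(X)))]


# Toy data: a dense cluster, a sparse cluster, one local and one global anomaly.
dense = [(0.1 * i, 0.1 * j) for i in range(6) for j in range(6)]
sparse = [(5.0 + i, 5.0 + j) for i in range(6) for j in range(6)]
X = dense + sparse + [(1.0, 1.0), (20.0, -10.0)]
LOCAL, GLOBAL = len(X) - 2, len(X) - 1

lof = lof_scores(X)
iso = iforest_scores(X)
fused = fused_scores(X)
print("LOF:", round(lof[LOCAL], 2), "(local)", round(lof[GLOBAL], 2), "(global)")
print("fused:", round(fused[LOCAL], 2), "(local)", round(fused[GLOBAL], 2), "(global)")
```

In this toy setting the point at (1.0, 1.0) lies close to the dense cluster in absolute terms but far from it relative to that cluster's spacing, which is the kind of local anomaly the LOF score is meant to expose. Taking the maximum lets either detector raise a point's score; a weighted sum would be an equally plausible choice.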