| Fault diagnosis methods are essential for keeping rotating machinery running safely and smoothly: they reduce the losses caused by faults, simplify maintenance, and protect personnel. Fault diagnosis is an emerging field in industrial settings, and the design of effective fault identification solutions based on artificial intelligence (AI) is attracting growing attention from industry and academia. However, real industrial scenarios pose two main problems: 1) Labeling monitoring data depends on domain-specific expert knowledge and requires considerable manpower, so, because of practical or cost constraints, monitoring data generally have no corresponding labels. 2) Faults in rotating machinery initiate and propagate very quickly, and a machine is shut down for maintenance immediately after a fault occurs. As a result, most of the collected monitoring data represent the normal state, and few data are collected under fault states. To overcome these limitations, it is necessary to study high-quality feature extraction methods that rely neither on data labels nor on large amounts of fault data. The recently emerging self-supervised contrastive learning approach compares the similarity between monitoring data samples to extract the essential features that most distinguish each sample from the others, and it is expected to address the lack of labels and the scarcity of fault samples in fault diagnosis. This work develops fault diagnosis methods based on self-supervised contrastive learning for bearings and gears, which are key vulnerable components of rotating machinery. To address the lack of labels on rotating machinery monitoring data in real industrial scenarios, we first propose a feature extraction method based on the contrastive predictive coding (CPC) model. The CPC model is a
self-supervised representation learning algorithm that can extract representative features from large numbers of unlabeled vibration signals. We use t-SNE dimensionality reduction and clustering to verify how well the contrastive encoder extracts data features. In addition to the lack of labels, we then impose a more stringent condition: extremely imbalanced datasets (a large amount of normal data and a very small amount of fault data). For these two conditions, we propose SRLD, a diagnostic framework based on Siamese representation learning that handles fault diagnosis on extremely imbalanced data and can also detect unknown states. First, self-supervised contrastive Siamese representation learning extracts high-quality features from 1D monitoring signals, so that fault diagnosis can be performed directly in the feature space with extremely imbalanced data. Unknown state detection is then achieved by judging whether a newly collected data sample lies outside the estimated range of each known class. The proposed SRLD is validated on two datasets: the quality of the features it extracts is evaluated qualitatively and quantitatively, and its diagnostic performance is compared with that of other popular methods. The results show that SRLD can provide representative features, accurately diagnose faults, and detect unknown states even when fault data are extremely scarce. Both fault diagnosis methods proposed in this paper are experimentally verified on the Case Western Reserve University (United States) bearing fault diagnosis benchmark dataset. To study the unknown state detection of the SRLD framework, this paper also uses a bearing and parallel-shaft gearbox fault simulation test bench to create bearing faults, gear faults, and combined bearing and gear faults, and to collect the corresponding vibration signals. |
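
The unknown state detection step described in the abstract (checking whether a new sample falls outside the estimated range of every known class) can be illustrated with a minimal sketch. This is not the SRLD implementation: the abstract does not say how the class ranges are estimated, so the sketch assumes a class centroid plus a distance-quantile radius. The function names `fit_class_ranges` and `diagnose`, the `quantile` parameter, and the 2D toy features (standing in for features already produced by a trained encoder) are all hypothetical.

```python
import numpy as np


def fit_class_ranges(features, labels, quantile=0.95):
    """Estimate a (centroid, radius) pair for each known class.

    Assumption: the "range" of a class is a ball around its centroid whose
    radius is a quantile of the training distances to that centroid.
    """
    ranges = {}
    for cls in np.unique(labels):
        pts = features[labels == cls]
        centroid = pts.mean(axis=0)
        dists = np.linalg.norm(pts - centroid, axis=1)
        ranges[int(cls)] = (centroid, float(np.quantile(dists, quantile)))
    return ranges


def diagnose(sample, ranges, unknown_label=-1):
    """Return the best-matching known class, or unknown_label if the
    sample lies outside the estimated range of every known class."""
    best_cls, best_ratio = unknown_label, np.inf
    for cls, (centroid, radius) in ranges.items():
        # Distance expressed as a fraction of the class radius.
        ratio = np.linalg.norm(sample - centroid) / max(radius, 1e-12)
        if ratio <= 1.0 and ratio < best_ratio:
            best_cls, best_ratio = cls, ratio
    return best_cls


# Toy feature vectors: class 0 = normal (more samples), class 1 = one fault.
normal = np.array([[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1],
                   [0.2, 0.0], [-0.2, 0.0], [0.0, 0.2], [0.0, -0.2]])
fault = np.array([[3.0, 3.0], [3.1, 3.0], [2.9, 3.0], [3.0, 3.1], [3.0, 2.9]])
features = np.vstack([normal, fault])
labels = np.array([0] * len(normal) + [1] * len(fault))

ranges = fit_class_ranges(features, labels)
pred_normal = diagnose(np.array([0.05, 0.0]), ranges)     # near class 0
pred_fault = diagnose(np.array([3.05, 3.0]), ranges)      # near class 1
pred_unknown = diagnose(np.array([10.0, -10.0]), ranges)  # far from both
```

In this sketch a sample is assigned to the known class whose range it falls deepest inside, and is reported as unknown (label `-1`) only when it lies outside every range. With very few fault samples the radius estimate is coarse, so the quantile is a tuning choice rather than a fixed rule.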