| The operation and maintenance of electromechanical equipment, which monitors equipment status through a series of methods and maintains the equipment through various means, is the basic guarantee that the equipment can provide its services. Traditional operation and maintenance of electromechanical equipment deals with a complex system composed of specific equipment and is highly dependent on manual operation and expert experience, so it is difficult to manage effectively. An Artificial Intelligence for IT Operations (AIOps) platform ingests the big data of electromechanical equipment and uses distributed storage, parallel computing, machine learning and other methods to store and process it, which improves efficiency and reduces costs. In addition, the raw status data of electromechanical equipment may contain missing values, redundant records, errors and other data problems, so data cleaning is needed before the data is ingested into the AIOps platform. At the same time, the basic functions of AIOps, such as fault analysis, rely on sufficient labeled samples, which incurs manpower costs. To address the data preprocessing requirements and the labeling cost that AIOps for electromechanical equipment entails, this paper studies a structured data cleaning method and a fault prediction algorithm based on few labels. The main work is summarized as follows: (1) To process electromechanical equipment data with high efficiency and low cost, the big data processing framework Spark is introduced; to improve the reliability of data access, processing and storage, an AIOps platform for electromechanical equipment is designed. The system is divided into three layers: data access service and data cleaning, fault analysis module, and analysis result feedback. Experiments show that the platform achieves fast access, reliable storage and efficient analysis of a large amount of
electromechanical equipment data. (2) A data cleaning method for electromechanical equipment based on Isolation Forest (IF) is proposed to preprocess the raw equipment data, performing anomaly detection and error correction on erroneous data. Experiments show that the algorithm improves data quality and benefits subsequent data analysis. (3) To address the cost of data analysis, which demands sufficient labeled data, a semi-supervised learning algorithm, IF-GBDT, based on an improved Isolation Forest and Gradient Boosting Decision Tree (GBDT), is proposed. Based on what is learned from the few labeled samples, the unlabeled data is labeled by the improved Isolation Forest algorithm. The Gradient Boosting Decision Tree algorithm is then trained on the newly labeled dataset to build a fault prediction model, reducing the impact of scarce labels on prediction accuracy. The experimental results show that the proposed method improves classification accuracy, adapts well to settings with few labels, and has good parallel performance. |
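
The anomaly detection step in contribution (2) can be illustrated with a minimal sketch of the standard Isolation Forest algorithm. This is not the improved variant proposed in the paper, whose details and error correction step are not given in this abstract. The function names (`anomaly_scores`, `clean`), the 0.75 score threshold, and the synthetic (temperature, vibration) readings are all hypothetical and chosen only for illustration.

```python
import math
import random


def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively partition rows with random axis-aligned splits."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # nothing to split on this feature; stop here
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node, depth=0):
    """Depth at which row lands in a leaf, plus c(size) for unsplit leaves."""
    if "split" not in node:
        return depth + c(node["size"])
    child = node["left"] if row[node["feature"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)


def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Return one score in (0, 1] per row; higher means easier to isolate."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    scores = []
    for row in rows:
        mean_depth = sum(path_length(row, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / c(psi)))
    return scores


def clean(rows, threshold=0.75):
    """Split rows into (kept, flagged); the threshold is an assumed value."""
    scores = anomaly_scores(rows)
    kept = [r for r, s in zip(rows, scores) if s < threshold]
    flagged = [r for r, s in zip(rows, scores) if s >= threshold]
    return kept, flagged


# Synthetic (temperature, vibration) readings plus one corrupted record.
readings = [(60.0 + (i % 7) * 0.5, 1.0 + (i % 5) * 0.05) for i in range(40)]
readings.append((250.0, 9.0))

kept, flagged = clean(readings)
print("flagged as anomalous:", flagged)
```

In this sketch, the corrupted record is separated from the rest by almost any random split, so its average path length is short and its score is high. A real pipeline would follow the flagging step with a correction rule, and on a Spark-based platform the per-row scoring could be distributed across partitions; both choices are outside what the abstract specifies.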