| In recent years, big data and artificial intelligence have been widely adopted across industries worldwide and are becoming the core technological engine of the fourth technological revolution. With advances in data, computing power, algorithms, and models, prognostics and health management (PHM) technology has been applied to fault diagnosis and prediction and to state time series prediction for high-speed railway trains. However, PHM algorithm models based on empirical knowledge and physical principles are often ineffective under complex working conditions. PHM algorithm models based on artificial intelligence can handle high-dimensional features, but most are "black box" models that lack interpretability and causality and are difficult to visualize. Because the rail transit industry has strict safety requirements, such models are difficult to put into operation. In addition, promotion of a PHM platform must satisfy the requirement that data cannot be wrong, cannot be leaked, and cannot be slow, and diversified user requirements pose a further challenge. This paper takes the traction system as its research object. First, feature engineering is carried out using feature contribution and causal inference methods to improve the accuracy, interpretability, causality, and visualization ability of the model. Then, the effectiveness of the proposed method for fault diagnosis early warning and state time series prediction is verified on real data. Finally, the method is applied in the field through the PHM platform. The main work of this paper is as follows: (1) Introducing the feature contribution method from statistics for feature engineering. To address the challenges of interpretability, visualization, and directionality of feature contribution in feature 
engineering, the adjusted feature contribution method from the field of statistics is introduced. Based on the information gain index and the Pearson correlation coefficient, public and real data sets are used to verify that the feature contribution method is effective for both discrete and continuous data. (2) Introducing the causal inference method from economics for feature engineering. To address the causality challenge of feature engineering, the adjusted causal inference method from economics is introduced. Causal hypotheses are verified iteratively through the steps of causal graph construction, causal identification, causal estimation, and causal refutation. The causal inference method is validated on the real data set using the placebo method and the random common cause method. (3) Proposing a fault diagnosis and early warning model that introduces time series features. Real field data on communication board faults in high-speed train traction converters are used as the data set. Based on the feature engineering, features with high contribution and causality are selected. The HISM-FCCI method is proposed by adding time series data to the data set as a new feature dimension. Ablation experiments were carried out against models such as the classic random forest, gradient boosting decision tree, and naive Bayes. The results show that the method improves early warning accuracy, precision, recall, and F score by more than 10% on average and computational efficiency by 35%. The method improves the robustness of the model and also provides causal diagram verification for product experts making product optimization decisions. (4) Proposing a state time series prediction model that introduces known future features. Based on feature contribution, a long short-term memory network state time series prediction model based on known future features (KFF-LSTM) is proposed. The future time series 
of some features is calculated by big data deduction and introduced into the model's hidden variable adjustment mechanism. The output gates, state update units, and hidden variables of the classic LSTM architecture are adjusted to optimize the prediction accuracy and time lag of the model. Ablation experiments were performed on the KFF-LSTM, RNN, GRU, and LSTM prediction models. The results show that, compared with the other methods, the MAE of the 16-step prediction on the test set is reduced by 18.0%, the RMSE is reduced by 7.9%, and the number of partial lag steps is improved by 40.0%. The model can be applied to long-term data prediction, and its feature engineering process is interpretable. (5) Applying the method on site based on the PHM platform. This paper uses distributed and big data technologies to build the PHM system architecture, which meets the challenge of "no error, no leakage, and no slowness" in data storage, retrieval, subscription, and transmission. Horizontal scaling out and in ensures efficiency when demand for computing power is high and saves costs when it is low. The PHM platform provides systematic "data + algorithm = service" capabilities for the differentiated needs of multiple users. On-site application cases demonstrate the social and economic value of applying this method to the PHM platform. The research results of this paper enrich PHM feature engineering technology for the traction systems of high-speed railway trains. In the feature engineering process, data experts and product experts can each contribute knowledge from their own fields and update their models and understanding through iteration. This provides a theoretical basis and decision support for fault diagnosis early warning, state time series prediction, and product optimization. |
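
The two indices named in item (1), information gain and the Pearson correlation coefficient, can be illustrated with a minimal sketch. It shows only the textbook definitions, not the adjusted feature contribution method of the paper, whose details the abstract does not give. The function names and the toy data are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy reduction in `labels` after splitting on a discrete `feature`."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def pearson(x, y):
    """Pearson correlation coefficient; its sign gives the direction of the relation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data (not from the paper): 1 = fault, 0 = normal.
fault = [1, 1, 0, 0]
alarm_flag = ["on", "on", "off", "off"]  # discrete feature
temperature = [80.0, 90.0, 50.0, 40.0]   # continuous feature

ig = information_gain(alarm_flag, fault)  # score for the discrete feature
r = pearson(temperature, fault)           # signed score for the continuous feature
```

In this sketch, information gain scores the discrete feature and the signed Pearson coefficient scores the continuous one, which mirrors the abstract's claim that the approach covers both data types and indicates the direction of a feature's contribution.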