| The reliability of data centers has long been a major concern in academia and industry. A storage system failure seriously affects the high availability of services. Hard disks are the main storage devices, and their sudden failure may lead to the permanent loss of key data, causing huge losses to users and cloud service providers. In recent years, researchers have mainly used machine learning to predict hard disk failures in order to reduce operation and maintenance costs when failures occur. However, because the quality of S.M.A.R.T. data differs greatly between data centers, no single model adapts well to all environments, so the robustness of failure prediction systems across different complex scenarios needs to be improved. To address these problems, a model-fusion-based hard disk failure prediction method, MFDFP (Model Fusion Disk Failure Prediction), is proposed. It consists of three parts: feature selection, model fusion, and remaining useful life prediction. In the feature selection part, a method based on KL divergence (Kullback-Leibler divergence) and recursive feature elimination accurately screens out the features that are strongly related to hard disk failures, which improves prediction accuracy and enhances interpretability. In the model fusion part, the popular gradient boosting tree algorithm is used to train the models, which are fused with the bagging method; a greedy strategy determines the optimal weight of each model, improving the robustness of the prediction system in different complex environments and yielding better prediction performance. In the remaining useful life prediction part, to address the oversimplified modeling of traditional methods, the data set labels are divided at a finer granularity, and the remaining useful life of the hard disk is modeled by regression analysis. Analyzing the prediction curve yields the key time point at which the hard disk state starts to deteriorate, which enables a timely response to failures. The experimental results show that, compared with the random forest model, MFDFP increases the F-measure by 9.2% to 11.4%, reduces the misclassification cost by 42.1% to 45.4%, and reduces the storage and migration cost by 16.2% to 22.7%. Compared with the CNN-LSTM model, it increases the F-measure by 11.2% to 38.7%, reduces the misclassification cost by 47.6% to 59.7%, and reduces the storage and migration cost by 16.5% to 34.9%. Compared with the rank-sum test and the Pearson coefficient, MFDFP improves the failure detection rate by 7.3% to 11.7% at the same false alarm rate. |
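
The abstract says only that a greedy strategy determines the weight of each fused model; it does not give the procedure. As an illustration, the following minimal sketch assumes one common choice, greedy forward selection with replacement (in the style of ensemble selection), scored by F-measure on a validation set. The function names `f_measure`, `greedy_weights`, and `fuse`, the 0.5 decision threshold, and the number of rounds are all hypothetical and are not taken from MFDFP.

```python
from typing import Dict, List, Sequence


def f_measure(y_true: Sequence[int], y_prob: Sequence[float],
              threshold: float = 0.5) -> float:
    """F1 score of thresholded probabilities; 0.0 when it is undefined."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and t == 1:
            tp += 1
        elif pred == 1 and t == 0:
            fp += 1
        elif pred == 0 and t == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def greedy_weights(val_probs: Dict[str, List[float]],
                   y_true: Sequence[int],
                   rounds: int = 20) -> Dict[str, float]:
    """Assumed greedy scheme: each round gives one vote to the model whose
    addition yields the best validation F-measure of the averaged
    prediction. A model's weight is its share of the votes."""
    names = list(val_probs)
    votes = {name: 0 for name in names}
    running_sum = [0.0] * len(y_true)
    best_score = -1.0
    for k in range(1, rounds + 1):
        best_name, round_score = names[0], -1.0
        for name in names:
            # Average of the k predictions if this model received the vote.
            blended = [(s + p) / k
                       for s, p in zip(running_sum, val_probs[name])]
            score = f_measure(y_true, blended)
            if score > round_score:
                best_name, round_score = name, score
        if round_score < best_score:
            break  # every possible extra vote would lower the score
        best_score = round_score
        votes[best_name] += 1
        running_sum = [s + p
                       for s, p in zip(running_sum, val_probs[best_name])]
    total = sum(votes.values())
    return {name: v / total for name, v in votes.items()}


def fuse(probs: Dict[str, List[float]],
         weights: Dict[str, float]) -> List[float]:
    """Weighted average of the per-model failure probabilities."""
    n = len(next(iter(probs.values())))
    return [sum(weights[m] * probs[m][i] for m in probs) for i in range(n)]
```

Here `val_probs` maps each trained model's name to its predicted failure probabilities on held-out disks. The weights returned by `greedy_weights` sum to 1 and can be passed to `fuse` to combine predictions on new data.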