Research On Hard Disk Failure Prediction Method Based On Model Fusion

Posted on: 2022-09-10

Degree: Master

Type: Thesis

Country: China

Candidate: Y Chen

Full Text: PDF

GTID: 2518306572490844

Subject: Computer system architecture

Abstract/Summary:

The reliability of data centers has long been a central concern in academia and industry. Storage system failures seriously affect the high availability of services. As the main storage device, a hard disk that fails suddenly may cause permanent loss of key data, resulting in huge losses for users and cloud service providers. In recent years, researchers have mainly used machine learning to predict hard disk failures in order to reduce operation and maintenance costs when failures occur. However, because the quality of S.M.A.R.T. data differs greatly between data centers, no single model adapts well to all environments, so the robustness of failure prediction systems across different complex scenarios needs to be improved.

To address these problems, a model-fusion-based hard disk failure prediction method, MFDFP (Model Fusion Disk Failure Prediction), is proposed. It consists of three parts: feature selection, model fusion, and remaining useful life prediction. In the feature selection part, a method based on KL divergence (Kullback-Leibler divergence) and recursive feature elimination is used to accurately screen out the features that are strongly related to hard disk failures, which improves prediction accuracy and interpretability. In the model training part, the popular gradient boosting tree algorithm is selected as the training model, the bagging method is used for fusion, and a greedy strategy determines the optimal weight of each model, which improves the robustness of the prediction system in different complex environments and yields better prediction performance. In the remaining useful life prediction part, to address the problem that traditional modeling methods are too simple, the data set labels are divided in a more fine-grained way, and the remaining useful life of the hard disk is modeled by regression analysis. Analysis of the prediction curve yields the key time point at which the hard disk state deteriorates, enabling a timely response to failure.

The experimental results show that, compared with the random forest model, the F-measure is increased by 9.2% to 11.4%, the misclassification cost is reduced by 42.1% to 45.4%, and the storage and migration cost is reduced by 16.2% to 22.7%. Compared with the CNN-LSTM model, the F-measure is increased by 11.2% to 38.7%, the misclassification cost is reduced by 47.6% to 59.7%, and the storage and migration cost is reduced by 16.5% to 34.9%. Compared with the rank-sum test and the Pearson coefficient, MFDFP improves the failure detection rate by 7.3% to 11.7% at the same false alarm rate.
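
The KL-divergence step of the feature selection can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`histogram`, `kl_divergence`, `kl_feature_scores`), the attribute names, the bin count, the Laplace smoothing, and the sample values are all assumptions made for illustration, and the recursive feature elimination stage is not shown. The idea is that a S.M.A.R.T. attribute whose value distribution differs strongly between failed and healthy disks receives a high score.

```python
import math


def histogram(values, lo, hi, bins):
    """Return Laplace-smoothed bin probabilities for values over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values) + bins  # add-one smoothing keeps every bin non-zero
    return [(c + 1) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with no zero entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kl_feature_scores(healthy, failed, bins=10):
    """Score each feature by KL(failed || healthy) over shared bins.

    healthy, failed: dicts mapping a feature name to a list of values.
    """
    scores = {}
    for name in healthy:
        both = healthy[name] + failed[name]
        lo, hi = min(both), max(both)
        p = histogram(failed[name], lo, hi, bins)
        q = histogram(healthy[name], lo, hi, bins)
        scores[name] = kl_divergence(p, q)
    return scores


# Made-up sample values for two hypothetical attributes.
healthy = {
    "smart_5_raw": [0, 0, 0, 1, 0, 0, 2, 0],
    "smart_9_raw": [1000, 5000, 12000, 20000, 8000, 15000, 3000, 9000],
}
failed = {
    "smart_5_raw": [8, 12, 30, 5, 40, 16, 9, 22],
    "smart_9_raw": [1000, 5000, 12000, 20000, 8000, 15000, 3000, 9000],
}

scores = kl_feature_scores(healthy, failed)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this sketch the attribute whose distribution shifts between the two groups ranks first, while an attribute with identical distributions scores zero. A ranking of this kind could then serve as the starting point for a wrapper method such as recursive feature elimination.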

Keywords/Search Tags:

Failure prediction, Model fusion, S.M.A.R.T., Robustness

Related items

1	Research On Data Center Disk Failure Prediction Model
2	Modeling And Implementation Of Equipment Sudden Large Failure Prediction Based On On-line Monitoring Data
3	Analysis Of Internet-facing Cascading Failure And Research To Enhance The Robustness
4	Research And Implementation Of Disk Failure Prediction Model Design And Generation System
5	Based On The Offline Time Series Data Of Sudden Failure Prediction
6	Research On Failure Analysis,Modeling And Prediction For Supercomputers
7	Research On Failure Mechanism And Life Prediction Methods For Vertical Double-diffused Power MOSFETs
8	Research On Intelligent Failure Prediction And Analysis Technology Of Optical Network System
9	Research On Transmission Robustness And Optimization In Heterogeneous Networks Based On Multipath TCP
10	Research On Key Technologies Of Failure Prediction Based On Machine Learning Method For Exascale System