Research On Hard Disk Failure Prediction Technology Based On Broad Learning System

Posted on: 2023-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: L X Peng
GTID: 2568306830452534
Subject: Computer technology
Abstract/Summary:
With the growth of global user storage requirements, the organizational structure of cloud data center storage systems has become increasingly complex, and mixing different kinds of storage devices has become the norm. As the most important storage devices, hard disks directly affect the performance and security of the storage system. In recent years, active fault-tolerant schemes that use machine learning algorithms to predict hard disk failures from the disks' SMART attributes have gradually become mainstream. However, existing methods still face three challenges. First, to handle the imbalance between positive and negative samples, commonly used resampling techniques do not consider the actual health status of the hard disk, so sample quality after resampling is poor. Second, in large-scale cloud data centers, modeling methods with long training times cannot adapt quickly to newly arriving data, so the model ages easily. Third, the hard disks in large-scale cloud data centers are heterogeneous; a homogeneous hard disk failure prediction model that lacks efficient modeling characteristics is difficult to generalize, and failures of a different disk type can only be predicted by retraining the entire model. To address these problems, the main research work and contributions of this thesis are as follows:

1. To address the first challenge, a resampling strategy based on the Affinity Propagation algorithm is proposed that fully considers the actual health status of the hard disk. The strategy oversamples positive samples through a labeling strategy for faulty-disk samples and undersamples negative samples through a screening strategy for healthy-disk samples, thereby resolving the imbalance between positive and negative samples. On a public dataset, the model achieves better prediction performance on the sample subset produced by this strategy than on the subset produced by another resampling method, which verifies the effectiveness of the strategy.

2. To address the second challenge, a homogeneous hard disk failure prediction model is proposed that integrates the affinity propagation clustering algorithm with the broad learning system. With the sample imbalance resolved, the efficient modeling characteristics of the broad learning system are used to build a failure prediction model quickly; when new data arrive, the output-layer weights are updated through incremental learning to counter model aging. Comparison experiments with other models on different sample subsets, obtained by resampling the original imbalanced dataset, demonstrate the effectiveness of the prediction model.

3. To address the third challenge, a heterogeneous hard disk failure prediction model, MRBLS, is proposed, which is based on manifold regularization and the broad learning system. Following the manifold regularization framework, the Laplacian matrix of the input data is constructed to approximate the manifold structure of the whole heterogeneous sample set. This further constrains the solution of the original broad learning system's optimization problem and improves its cross-domain learning ability, so that failures of heterogeneous hard disks can be predicted. On two public datasets, experiments using the SMART data of two heterogeneous hard disks as model input demonstrate the effectiveness of the prediction model.
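The two model ideas named in the abstract, a broad learning system whose output-layer weights have a closed-form solution and a graph-Laplacian (manifold) penalty added to that solution, can be illustrated with a minimal NumPy sketch. This is not the thesis's MRBLS implementation. The random data standing in for SMART features, the node counts, the k-nearest-neighbour graph, and the parameters `lam` and `gamma` are all assumptions chosen for illustration, and the update used is the generic Laplacian-regularized least-squares form W = (AᵀA + λI + γAᵀLA)⁻¹AᵀY.

```python
import numpy as np

def bls_features(X, Wf, bf, We, be):
    """Build the broad layer A = [feature nodes | enhancement nodes]."""
    Z = X @ Wf + bf              # linear feature nodes (random projection)
    H = np.tanh(Z @ We + be)     # nonlinear enhancement nodes
    return np.hstack([Z, H])

def knn_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - S of a symmetrized k-NN graph."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)               # a point is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]        # k nearest neighbours per row
    S = np.zeros((n, n))
    S[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    S = np.maximum(S, S.T)                     # make the graph undirected
    return np.diag(S.sum(axis=1)) - S

def fit_output_weights(A, Y, lam=1e-2, gamma=0.0, L=None):
    """Solve (A^T A + lam*I + gamma*A^T L A) W = A^T Y for the output weights."""
    G = A.T @ A + lam * np.eye(A.shape[1])
    if L is not None and gamma > 0:
        G = G + gamma * (A.T @ L @ A)          # manifold smoothness penalty
    return np.linalg.solve(G, A.T @ Y)

# Synthetic stand-in for SMART features (assumption: 200 samples, 8 attributes).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
Y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)[:, None]

# Random, untrained hidden weights: only the output layer is solved for.
Wf, bf = rng.normal(size=(8, 10)), rng.normal(size=10)
We, be = rng.normal(size=(10, 30)), rng.normal(size=30)

A = bls_features(X, Wf, bf, We, be)
L = knn_laplacian(X, k=5)

W_plain = fit_output_weights(A, Y)                    # ridge-only solution
W_mr = fit_output_weights(A, Y, gamma=1e-3, L=L)      # with Laplacian penalty

acc_plain = float((((A @ W_plain) > 0.5) == (Y > 0.5)).mean())
acc_mr = float((((A @ W_mr) > 0.5) == (Y > 0.5)).mean())
print(f"train accuracy: plain={acc_plain:.3f}, manifold-regularized={acc_mr:.3f}")
```

Because only the output weights are solved for, fitting amounts to one linear solve, which is the property the abstract refers to as efficient modeling. The incremental update for newly arriving data and the Affinity Propagation resampling step are not shown here.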
Keywords/Search Tags: broad learning system, hard disk failure prediction, sample imbalance