Research On Fault Prediction Method Of Large-Scale Hard Disk Based On Machine Learning

Posted on: 2022-08-22
Degree: Master
Type: Thesis
Country: China
Candidate: T T Chang
GTID: 2568306488979359
Subject: Engineering
Abstract/Summary:
With the advent of the big data era, the total volume of global data has grown explosively, and more and more industrial-scale institutions rely on data centers to store and process data. A data center outage can cause huge losses and even catastrophic consequences. According to statistics, hard disks are one of the largest sources of failure in data centers: hard disk failures alone account for 71.1% of all hardware failures. Accurate prediction of hard disk failures is therefore essential. Based on the current state of research, this thesis makes the following two contributions.

First, to address the problem that the misjudgment rate of hard disk failure prediction models rises along with their prediction accuracy, this thesis proposes a random forest prediction model based on variable weights. The correlation between each hard disk feature attribute and the failure label is calculated to reduce the dimensionality of the original data set, and the sum of the split information value and the average split information value is used in place of the single split information value. Better decision trees are then selected according to their accuracy and diversity values, and their weights are dynamically assigned to form a strong random forest classifier. So that the model can be applied in production and important data on a failing disk can be migrated in time, average early warning time is introduced as an evaluation metric. In experiments, the method achieves a prediction accuracy of 93.12%, reduces the misjudgment rate to 0.008%, and gives an early warning time of 15 days. Compared with existing prediction models, the proposed model minimizes the misjudgment rate while maintaining a given level of failure prediction accuracy.

Second, to address the over-fitting and biased fitting that existing hard disk health prediction models exhibit on highly imbalanced data sets, this thesis proposes a hard disk health prediction model based on a GA-LPAT neural network. The SMART attribute values of failed hard disks are selected and added to the initial population of a genetic algorithm (GA), which produces offspring according to the fitness function. Layer-wise perturbations are then added to any layer of an LSTM for adversarial training, giving a layerwise perturbation-based adversarial training method that is used to train the hard disk state prediction model. In experiments, the method detects an impending hard disk failure 25 days in advance, and the health level classification accuracy for healthy and failed hard disks rises to 82.91% and 53.16%, respectively. Compared with existing prediction models, the proposed model better mitigates over-fitting and biased fitting and achieves higher accuracy in hard disk health classification.
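
As a rough illustration of the variable-weight idea in the random forest model, the sketch below combines per-tree predictions by weighted voting instead of a simple majority. It is not the thesis implementation: the function name `weighted_vote`, the label encoding (0 = healthy, 1 = failing), and the example weights are all assumptions, and the abstract does not state how the weights are actually computed from accuracy and diversity.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine per-tree class predictions using per-tree weights.

    predictions: one class label per tree (assumed: 0 = healthy, 1 = failing)
    weights: one non-negative weight per tree (assumed to reflect tree quality)
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight per tree is required")

    # Accumulate the total weight behind each candidate label.
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight

    # The label backed by the largest total weight wins.
    return max(totals, key=totals.get)


# Hypothetical example: one strong tree outvotes two weak ones.
tree_predictions = [1, 0, 0]
tree_weights = [0.95, 0.40, 0.45]
result = weighted_vote(tree_predictions, tree_weights)
```

With equal weights the same call reduces to ordinary majority voting, which is why the choice of weights is what distinguishes a variable-weight forest from a standard one.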
Keywords/Search Tags: SMART, Feature attribute extraction, Dynamic weight allocation, Genetic algorithm (GA), Adversarial training