| In the information age, data has become a basic national strategic resource. Databases have been used for data storage and computation for more than half a century and have become the fastest-growing data processing tool. With the widespread use of database technology in all areas of life, attention has turned to the stability of database services. If a database behaves abnormally, the core data link is affected, which impacts every dependent service and can cause huge economic losses. At present, database incidents are detected mainly through database monitoring and repaired mainly through manual operations. Meanwhile, the rapid development of machine learning has brought new ideas to the database field. In the context of big data, this paper focuses on three questions for database incident prediction: which scenarios are suitable for solving database problems with machine learning algorithms, how to predict database incidents, and how to analyze massive volumes of SQL and build models of performance issues. The research on database incident prediction covers two parts: database monitoring prediction and SQL performance issue prediction. The research object is MySQL, and all experimental data come from the production environment of JJWorld (Beijing) Network Technology Co., Ltd. First, database monitoring prediction with machine learning is analyzed and demonstrated using a combination of the Random Forest model and the ARIMA model. For time series prediction, the ARIMA model improves on the Random Forest model's MAPE by 53%, and prediction over a short-term period (2 weeks) improves MAPE by more than 25% compared with a long-term period (6 months). Database incident prediction is therefore a suitable scenario for combining the Random Forest and ARIMA models, which greatly improves prediction efficiency. Second, database SQL performance issue prediction is analyzed and verified, based on analysis of plain-text SQL slow logs. Using a combination of the EM model and the AdaBoost model, the research found that EM clustering based on the Gaussian mixture model (GMM) gave the best clustering results and that a cluster number of 4 yields useful SQL performance issue types. A comparison of 7 classification algorithms found that AdaBoost was the most suitable for the SQL performance model in the comprehensive evaluation: compared with the SVM algorithm, accuracy improved by 6% and execution efficiency improved by a factor of 7. Finally, the prediction model was developed and deployed on the platform, and the expected results were obtained. |
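
The abstract reports its monitoring results in terms of MAPE (mean absolute percentage error). As a minimal sketch of how such a comparison could be computed, the snippet below uses the standard MAPE definition and assumes that "improves by 53%" means a relative reduction in MAPE; the abstract does not state this explicitly. The function names and all numeric values are hypothetical and are not taken from the paper's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Standard definition; undefined when any actual value is zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_improvement(baseline_mape, new_mape):
    """Relative reduction in MAPE, in percent (assumed interpretation)."""
    return 100.0 * (baseline_mape - new_mape) / baseline_mape


# Hypothetical monitoring metric and two hypothetical forecasts.
actual = [100.0, 200.0, 400.0]
baseline_forecast = [80.0, 240.0, 320.0]   # each point is off by 20%
improved_forecast = [90.0, 220.0, 360.0]   # each point is off by 10%

baseline = mape(actual, baseline_forecast)
improved = mape(actual, improved_forecast)
print(f"baseline MAPE: {baseline:.1f}%")
print(f"improved MAPE: {improved:.1f}%")
print(f"relative improvement: {relative_improvement(baseline, improved):.1f}%")
```

Under this reading, a lower MAPE is better, and a 53% improvement would correspond to the second model's MAPE being a little under half of the first model's.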