| The Electric Internet of Things is a power system built on cyber-physical integration that newly incorporates uncertain social factors, forming a power cyber-physical-social system. It is an important strategic support for State Grid Corporation's goal of building an internationally leading energy Internet enterprise with Chinese characteristics. However, cyber-physical-social integration creates a complex and diverse operating environment for the Electric Internet of Things, leaving it vulnerable to external risks on every side; a disturbance at any level may affect its safe operation. To predict the security risks facing the Electric Internet of Things in time, and thereby identify and strengthen its weak links, this paper takes a data mining perspective and comprehensively considers operational data from the cyber, physical, and social sides. Through data fusion and balancing, feature selection, and construction of a risk prediction model, it proposes an ensemble-learning-based method for predicting the operation risk of the Electric Internet of Things. The main research contents are as follows. (1) Predicting the operation risk of the Electric Internet of Things requires a model built on operational data from the cyber, physical, and social sides together. However, risk samples are relatively scarce in actual operating data, so the model learns them insufficiently during training and its prediction accuracy is low. To address this, the risk data that affects the operation of the Electric Internet of Things is first defined, and the dataset is then acquired through simulation experiments. Based on random matrix theory, the risk data from the cyber, physical, and social sides are fused with the time series as the benchmark, yielding a multi-dimensional risk dataset. Then, based on the ADASYN
method, the minority class samples in the fused training set are oversampled so that the class sizes become approximately balanced. The experimental results show that, after data balancing, the prediction accuracy for minority risk categories improves significantly. (2) After balancing, the risk sample set is still large and high-dimensional. Its many features may include redundant and irrelevant ones, which increase the time overhead of the subsequent learning model and degrade its performance. To address this, the ReliefF-S algorithm is used to select an optimal feature subset from the balanced operational risk sample set of the Electric Internet of Things. When evaluating redundancy, the algorithm jointly considers each feature's contribution to classification and the correlation between features, which eliminates redundant data more effectively. Experimental results show that this method effectively reduces the data dimension, shortens the model's training and prediction time, and improves its overall efficiency. (3) Because risks are highly random, hard to detect, and complex in their causes, it is difficult to investigate them quickly using only conventional manual inspection, probability statistics, and electrical mechanism analysis. As real-time data acquisition capabilities mature, machine learning methods can mine the data to check for risks quickly and accurately. Therefore, this paper introduces the Bayesian optimization algorithm to tune the key parameters of CatBoost and constructs an operation risk prediction model for the Electric Internet of Things based on BO-CatBoost. Combined with the data balancing and feature selection methods, a complete risk prediction scheme is finally given. The experimental results show that the risk prediction scheme
can effectively address the problems of low accuracy and slow speed in risk prediction, and has strong applicability. |
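
The minority-class oversampling step described in (1) can be illustrated with a minimal, standard-library-only sketch of the ADASYN idea. This is not the paper's implementation: the function name `adasyn`, its parameters (`k`, `beta`, `seed`), and the toy data are assumptions made purely for illustration. Each minority sample receives a share of the synthetic samples proportional to the fraction of majority-class points among its nearest neighbours, and each synthetic point is interpolated between that sample and a nearby minority sample.

```python
import math
import random


def adasyn(X, y, minority_label, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples (hypothetical helper, ADASYN-style)."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_major = len(X) - len(minority)
    # Number of synthetic samples needed to reach the requested balance level.
    total = int((n_major - len(minority)) * beta)
    if total <= 0 or len(minority) < 2:
        return []

    def nearest(point, pool, count):
        # Indices of the `count` points in `pool` closest to `point`, excluding itself.
        order = sorted(range(len(pool)), key=lambda j: math.dist(point, pool[j]))
        return [j for j in order if pool[j] is not point][:count]

    # Hardness ratio: share of majority points among each minority point's neighbours.
    ratios = []
    for x in minority:
        neigh = nearest(x, X, k)
        ratios.append(sum(y[j] != minority_label for j in neigh) / k)
    norm = sum(ratios)
    if norm == 0:
        # No minority point borders the majority class: fall back to uniform weights.
        ratios, norm = [1.0] * len(minority), float(len(minority))

    synthetic = []
    for x, r in zip(minority, ratios):
        n_new = round(r / norm * total)  # harder points get more synthetic samples
        partners = nearest(x, minority, k)
        for _ in range(n_new):
            z = minority[rng.choice(partners)]
            lam = rng.random()
            # Linear interpolation between x and a neighbouring minority sample.
            synthetic.append([a + lam * (b - a) for a, b in zip(x, z)])
    return synthetic


# Toy data (assumed): 8 "normal" samples (label 0) and 3 "risk" samples (label 1).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [2, 2], [2, 3], [3, 2],
     [5, 5], [5, 6], [6, 5]]
y = [0] * 8 + [1] * 3
new_points = adasyn(X, y, minority_label=1)
print(len(new_points), "synthetic minority samples generated")
```

Because each sample's count is rounded separately, the number of generated samples only approximately matches the class gap. In practice a maintained library implementation would normally be preferred over a hand-written one.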