Research On Load Forecasting Of Power SCADA Based On Distributed Machine Learning Algorithms

Posted on: 2017-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: C Jiang
GTID: 2382330596957431
Subject: Control Science and Engineering

Abstract/Summary:
The power Supervisory Control and Data Acquisition (SCADA) system is an important guarantee of the safe and efficient operation of a power system. Predicting the power load from the load-related data recorded by SCADA matters greatly for the economic benefits tied to power generation control and power management. As SCADA has developed, the system itself has become increasingly complicated and networked, and the historical data, which are composed of the running-state data of the system, have grown not only in quantity but also in dimension. In addition, load-related information such as temperature and holiday information is gradually being integrated into SCADA. All of these integrated data create the conditions for more accurate load forecasting and draw more and more methods and models into practical use in the field.

To improve the accuracy and speed of load forecasting, this thesis studies power system load forecasting by combining the Spark distributed computing platform with machine learning algorithms, building on a cluster-based transformation of the physical structure of the SCADA data analysis layer. The main aspects of the work are as follows.

First, starting from an improvement of the underlying design, the physical layer of SCADA is studied and the distributed processing of each layer is sorted out. On this basis, a new architecture is proposed that integrates a distributed computing platform with the existing data center. This scalable upgrade of SCADA not only ensures the normal operation of the original system functions but also allows the new computing framework to be deployed.

Then, on the basis of this hybrid architecture, power load forecasting is taken as the typical scenario and main research object. The k-means++ algorithm of MLlib is used to cluster the data flowing into SCADA, and the distances between the cluster centers are used to detect abnormal data, which are then repaired using the cluster center and the normal data belonging to that center. The load data extracted from the repaired set are merged with the load-related records in SCADA into vectors, which are passed to the decision tree and random forest models of MLlib for cross-validation to find the model with optimal parameters.

Finally, to verify the actual effect of the model, a workflow based on the Spark Machine Learning Pipeline is constructed using the real load data and load-related data provided by EUNITE. The results show that the method is superior not only to the traditional generalization neural network algorithm but also to the MapReduce-based extreme learning machine and support vector machine prediction algorithms.
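
The anomaly detection and repair step described in the abstract can be illustrated with a minimal sketch. It is written in plain Python rather than Spark MLlib and is not the thesis implementation. It assumes the cluster centers have already been fitted (for example by a k-means++ run), and it takes one possible reading of the detection rule: a record is flagged when its distance to the nearest center exceeds a threshold. The names `nearest_center`, `repair_outliers`, and `threshold` are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean


def nearest_center(point, centers):
    """Return (index, distance) of the center closest to `point`."""
    distances = [math.dist(point, c) for c in centers]
    idx = min(range(len(centers)), key=distances.__getitem__)
    return idx, distances[idx]


def repair_outliers(points, centers, threshold):
    """Replace points lying farther than `threshold` from their nearest center.

    Assumption (not from the thesis): a flagged point is replaced by the mean
    of the normal points assigned to the same center, or by the center itself
    when that cluster has no normal points.
    """
    assigned = [nearest_center(p, centers) for p in points]

    # Collect the normal (non-flagged) members of each cluster.
    normal = {i: [] for i in range(len(centers))}
    for p, (idx, dist) in zip(points, assigned):
        if dist <= threshold:
            normal[idx].append(p)

    repaired = []
    for p, (idx, dist) in zip(points, assigned):
        if dist <= threshold:
            repaired.append(tuple(p))
        elif normal[idx]:
            # Column-wise mean of the normal members of this cluster.
            repaired.append(tuple(mean(col) for col in zip(*normal[idx])))
        else:
            repaired.append(tuple(centers[idx]))
    return repaired


# Hypothetical two-feature records and pre-fitted centers.
records = [(1.0, 1.0), (3.0, 1.0), (2.0, 9.0), (10.0, 10.0)]
fitted_centers = [(2.0, 1.0), (10.0, 10.0)]
cleaned = repair_outliers(records, fitted_centers, threshold=2.0)
```

In this toy input the third record sits far from both centers, so it is replaced by the mean of the two normal records in its cluster. In a Spark setting the centers would come from the fitted clustering model, and the threshold would need to be tuned on the actual SCADA data.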
Keywords/Search Tags: SCADA, Decision Tree, Random Forest, Spark, K-means