Research on Big Data Integration, Storage and Processing Methods for Power Distribution and Utilization Applications

Posted on: 2019-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: L T Wang
GTID: 2392330590967296
Subject: Electrical engineering
Abstract/Summary:
With the continuing development of smart distribution networks and the wide application of intelligent power equipment, the volume of power distribution and utilization data is increasing rapidly. On the one hand, these data come not only from systems internal to the distribution network, such as the production management system, the load control and management system, the power load information acquisition system, and the marketing management system, but also from related external systems, such as geographic, socioeconomic, and environmental systems. Taken together, the data exhibit the characteristics of big data: large volume, multiple types, and fast growth. On the other hand, power distribution and utilization applications such as load forecasting, network optimization, flood control, and electricity conservation are becoming more intelligent and lean. How to use big data to improve the accuracy, breadth, and depth of these applications has become both a new challenge and an opportunity for the power industry. By studying multi-source integration, storage optimization, associative queries, and parallel processing of power distribution and utilization big data, fast data acquisition and sharing can be realized and the efficiency of data analysis and data mining improved, providing more efficient technical support for applications built on these data.

Power distribution and utilization big data are widely distributed, of many types, and heterogeneous. This thesis therefore chooses a suitable data exchange and communication mode according to the characteristics of each source system, in order to migrate multi-source data across platforms. To address heterogeneity during multi-source integration, standardized metadata and corresponding data dictionaries are adopted so that the multi-source data are integrated in a standardized form.

Building on data integration, the thesis studies a Hadoop-based optimization method for big data storage. The aim is to solve the two major problems of big data, efficient storage and fast query, while meeting the need of power distribution and utilization applications for association analysis across multiple data sources. A hash bucket algorithm that takes data correlation into account is proposed. The algorithm stores related data together, which improves the efficiency of data query and processing. On top of the optimized storage, a MapReduce-based parallel association query over multi-source power distribution and utilization big data is implemented. Tests on a Hadoop cluster show that, after hash bucket storage optimization, the parallel association query over multi-source data is efficient.

Most processing steps in a big data environment, such as data format conversion, abnormal data identification, and data cleaning, require complex iterative computation, while downstream applications place high demands on data processing efficiency. Because semi-structured and unstructured power grid topology data are difficult to use directly, this thesis uses Spark's in-memory parallel computing to analyze large-scale distribution network topology data efficiently. Massive load data inevitably contain anomalies, such as missing values and abnormally large fluctuations, which degrade the accuracy of applications such as load spatial and temporal distribution prediction and grid structure optimization. A parallel fuzzy C-means (FCM) algorithm based on Spark is therefore designed and implemented, and applied to the identification and correction of abnormal load data. Tests in a Spark experimental environment show that this method identifies and corrects abnormal load data efficiently and accurately.

Taking load spatial and temporal distribution prediction as an example application, the methods proposed in this thesis are applied to the multi-source integration, storage optimization, association query, and parallel processing of the required data, laying the data foundation for implementing the application.
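
The idea of storing related data together so that association queries stay local to a bucket can be illustrated with a minimal single-machine sketch. This is not the algorithm proposed in the thesis, whose details are not given in the abstract; it only assumes that records from different sources share an association key (here a hypothetical `meter_id`) and that a stable hash of that key selects the bucket. The bucket count, field names, and sample records are all made up for illustration.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count, chosen only for this sketch


def bucket_of(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map an association key to a bucket index with a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


def store(records, key_field, buckets, source):
    """Place each record in the bucket selected by its association key."""
    for rec in records:
        buckets[bucket_of(rec[key_field])][source].append(rec)


def bucket_local_join(buckets, left, right, key_field):
    """Join two sources bucket by bucket.

    Records with the same key always share a bucket, so each bucket can be
    joined independently (for example, as one parallel task).
    """
    joined = []
    for b in sorted(buckets):
        by_key = defaultdict(list)
        for rec in buckets[b][right]:
            by_key[rec[key_field]].append(rec)
        for rec in buckets[b][left]:
            for match in by_key.get(rec[key_field], []):
                joined.append({**rec, **match})
    return joined


# Hypothetical sample data from two source systems.
load_records = [
    {"meter_id": "M001", "load_kw": 12.5},
    {"meter_id": "M002", "load_kw": 7.1},
    {"meter_id": "M003", "load_kw": 30.4},
]
customer_records = [
    {"meter_id": "M001", "tariff": "residential"},
    {"meter_id": "M003", "tariff": "industrial"},
]

buckets = defaultdict(lambda: defaultdict(list))
store(load_records, "meter_id", buckets, "load")
store(customer_records, "meter_id", buckets, "customer")
result = bucket_local_join(buckets, "load", "customer", "meter_id")
```

Because the bucket depends only on the key, a join never has to look outside the bucket it is working on. In a distributed setting this is the property that would let each bucket be processed without moving data between nodes.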
Keywords/Search Tags: big data of power distribution and utilization, Hadoop, hash bucket algorithm, Spark technology, parallel FCM algorithm