Research And Development On Big Data Application Of The State Grid Audit System

Posted on:2018-09-29

Degree:Master

Type:Thesis

Country:China

Candidate:Z Xu

Full Text:PDF

GTID:2359330518455520

Subject:Computer technology

Abstract/Summary:

With the rapid adoption of information technology in power systems, the volume of data acquired every day is growing quickly, and data from diverse sources can reach the petabyte (PB) scale. Facing these challenges, it is imperative to research big data applications in power systems and to develop power-system-oriented big data analysis platforms. This paper first reviews one of the time-consuming data calculations in the current State Grid auditing system, then implements the corresponding computations in a pilot Hadoop development environment to verify the viability of big data applications in the State Grid auditing system, and on that basis proposes a big data solution to optimize the calculations in the auditing system. The pilot environment consists of a 15-node Hadoop cluster; data are transferred by Sqoop into the Hive data warehouse on top of the Hadoop Distributed File System (HDFS) in the cluster. A mass data query test is conducted by separately using HiveQL and Spark SQL to run a set of specified queries over the same large dataset within the MapReduce distributed processing framework, with the aim of facilitating massive data query and analysis. The test results show that the Hadoop distributed architecture scales well enough to meet the rapidly growing data processing needs of the State Grid auditing system. They also show that the larger the dataset, the more pronounced the advantage, and that Spark queries are more efficient than Hive queries.

As a key class of algorithms in data analysis and data mining, clustering analysis has been widely used in many fields. In the pursuit of a holistic auditing optimization spanning thinking, contents, objectives, and technology application, and of turning traditional verification auditing into risk-based preventive auditing, clustering algorithms have broad scope for application. In the face of ever-growing data, K-means, the most widely used partitional clustering algorithm in practice, and Hadoop, a widely used parallel computing framework today, are both very attractive to researchers and developers. It therefore makes sense to find a better way to implement K-means using the parallelism of the Hadoop platform. This paper summarizes the principles of the K-means algorithm together with the MapReduce distributed computing model and puts forward a Java implementation of the K-means algorithm on Hadoop MapReduce. Through algorithm correctness validation, cluster speedup evaluation, and cluster scale-up verification, this paper confirms that the improved K-means algorithm, besides being highly efficient and scalable, can effectively exploit the powerful parallel computing capability of the Hadoop platform, and can thus be used in developing a more intelligent State Grid auditing system in the future.
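The abstract does not reproduce the thesis's Java code, so the following is only a minimal single-process sketch in Python of how one K-means iteration is commonly split into a map step and a reduce step. It is not the author's implementation and does not use Hadoop. The function names (`map_phase`, `reduce_phase`, `kmeans`), the tuple representation of points, and the stopping tolerance are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centroids):
    """Map step (hypothetical): emit (nearest centroid index, point) pairs."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce step (hypothetical): average the points grouped under each key."""
    groups = defaultdict(list)
    for key, p in pairs:
        groups[key].append(p)
    updated = list(centroids)  # a centroid with no assigned points is kept as-is
    for key, members in groups.items():
        updated[key] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return updated


def kmeans(points, centroids, max_iter=20, tol=1e-9):
    """Repeat map + reduce until the centroids stop moving (assumed criterion)."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift <= tol:
            break
    return centroids
```

Calling `kmeans(points, initial_centroids)` with a list of equal-length numeric tuples returns the final centroids. On a real cluster, each iteration would typically run as a separate MapReduce job, with the current centroids distributed to every mapper; that orchestration is omitted here.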

Keywords/Search Tags:

Power Big Data, Intelligent grid, distributed storage, parallel computing, Auditing, clustering, K-means

Related items

1	Intelligent Upgrade Program Design And Comprehensive Evaluation Of Yanshi Rural Power Grid
2	Research On The Relationship Management Of The Power Grid For Terminal Users In The Mode Of "Internet +"
3	Subsidies Of Distributed Wind Power Based On Micro Grid
4	Analysis Of Customer Value Based On K-Means Clustering
5	Interests Evolution In Grid Connection Of Distributed PV Generation
6	Financial Data Analysis Based On Parallel Statistical Computing
7	Research On Hadoop Distributed Computing Platform For Power Application
8	Small And Distributed Logistics Service Collaboration In Cloud Computing Environment
9	Implementation Of Integration Short Leadtime Program In Enterprise Supply Chain Management System Based On Distributed Technology
10	Research On Power Grid Evaluation Based On Comprehensive Weighting Method And Clustering Analysis