With the continuing digitalization and informatization of the State Grid Corporation of China, the volume of data in 220kV substation projects has grown geometrically and the amount of abnormal data has risen sharply, which poses challenges for project management. The data in 220kV substation projects come from many sources, are hierarchical, and are linked by complex correlations. They are stored discretely in files belonging to different project stages and in outdated storage formats, so data retrieval is inefficient and integration is difficult. In addition, the causes and manifestations of abnormal data in 220kV substation projects are complex and varied. Current processing often relies on manual screening, which can neither guarantee data quality nor provide sufficient processing capacity. Graph data, with its unstructured organization of nodes and edges, is highly extensible and can expose the internal correlations within data in depth, so it is well suited to the storage of power data and to the processing of abnormal power data. This work studies data storage and anomaly processing for 220kV substation engineering of the State Grid Corporation of China. A graph-based abnormal data processing system is designed, consisting of two parts: a data storage architecture and an anomaly processing architecture. The data storage architecture builds a graph model of the 220kV substation engineering common data model, engineering files, and other data from the perspective of information storage modes and constraints, and proposes a graph database construction method based on this model. The anomaly processing architecture divides 220kV substation engineering data into numerical and text types. For numerical data, it proposes the HV-DBSCAN algorithm, which considers the data set along two dimensions: its statistical features and its density features. This algorithm detects numerical anomalies efficiently and improves the accuracy of anomaly processing. For text data, the improved HAC-TextRank algorithm is proposed, which considers the frequency and semantic features used in natural language processing and applies hierarchical clustering before information extraction. The algorithm improves the depth of feature mining, execution efficiency, the quality of extraction results, and adaptability to different texts, and it filters redundant data through automatic text information extraction, thereby improving the efficiency of anomaly processing. Finally, the data storage architecture and the anomaly processing architecture are integrated into a web system, and a visualization module is added, to build an abnormal data processing system suited to 220kV substation engineering. Practical results indicate that the system, through its application of graph data principles, reduces the difficulty of data management for 220kV substation engineering, improves the efficiency of abnormal data processing, and effectively supports the digitalization and informatization of the State Grid Corporation of China.
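To make the idea of combining a statistical feature with a density feature more concrete, the following is a minimal Python sketch. It is not the HV-DBSCAN algorithm, whose details are not given in this abstract. It simply flags a value when either its robust z-score (median and median absolute deviation) is large or a plain DBSCAN pass labels it as noise. The function names, the thresholds `z_max`, `eps`, and `min_pts`, the choice to take the union of the two tests, and the sample readings are all assumptions made for illustration.

```python
from statistics import median


def robust_z_scores(values):
    """Statistical feature: deviation from the median in units of MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0 for _ in values]
    # 0.6745 rescales MAD so the score is comparable to a standard z-score.
    return [0.6745 * (v - med) / mad for v in values]


def dbscan_noise(values, eps, min_pts):
    """Density feature: indices that plain 1-D DBSCAN leaves unclustered."""
    n = len(values)
    # Each point's eps-neighbourhood (the point itself is included).
    neighbors = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    core = [len(nb) >= min_pts for nb in neighbors]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None or not core[i]:
            continue
        # Grow a new cluster outwards from an unvisited core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] is None:
                    labels[q] = cluster
                    if core[q]:  # only core points keep expanding
                        stack.append(q)
        cluster += 1
    return {i for i in range(n) if labels[i] is None}


def detect_anomalies(values, z_max=3.5, eps=1.0, min_pts=3):
    """Flag indices that fail the statistical test or the density test.

    Taking the union of the two tests is an illustrative assumption.
    """
    scores = robust_z_scores(values)
    statistical = {i for i, s in enumerate(scores) if abs(s) > z_max}
    density = dbscan_noise(values, eps, min_pts)
    return sorted(statistical | density)


# Hypothetical voltage readings in kV; the value at index 5 is the outlier.
readings = [220.1, 220.3, 219.8, 220.0, 220.2, 235.0, 219.9, 220.4]
print(detect_anomalies(readings))  # prints [5]
```

In this sketch the two tests are independent, so a value that is statistically extreme but sits in a dense region, or one that is isolated but not extreme, would still be flagged. Whether the actual method combines the two features this way is not stated here.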