| In recent years, amid the current wave of information innovation and digital development, data has become an essential asset for enterprises. With data growing exponentially, enterprises face a practical problem: how to manage and use their enormous data resources effectively and gain timely insight into the value behind the data. This dissertation is motivated by the data governance difficulties and the lack of correlated data query and analysis capabilities encountered while deepening the construction of the two-level data center of the national power grid. Its aim is to use data analysis technology to explore the value of the data in the center and to improve both the quality of data access and the breadth and depth of data monitoring and analysis, thereby providing strong support for the construction of the data center. First, we study hot and cold data identification, in which the temperature of data measures the differing value of data at different stages of its life cycle. To address the problems of conventional hot and cold data identification algorithms, this dissertation proposes an identification algorithm based on data temperature. It defines how hot or cold a piece of data is by the level of its data temperature value, quantifies the data temperature by integrating three indicators (access frequency, access time, and access relevance), and treats the data temperature value as a permanent attribute that accompanies the data throughout its life cycle. Experiments show that this algorithm identifies hot and cold data more effectively than the LRU and LFU algorithms. Second, to account for the temporal nature and variability of data-center data, this dissertation builds a data temperature prediction model based on SA-BiLSTM. The model introduces a self-attention mechanism into the BiLSTM prediction model, so that it not only mines the temporal information of the data with BiLSTM but also uses self-attention to assign different weights to the feature information at each time step, making full use of the variability of the data. Experiments show that the model achieves lower MAE and RMSE, and therefore higher prediction accuracy, than the other prediction models compared. Finally, this dissertation designs and develops a visual analysis system based on data temperature. First, a requirements analysis is carried out, the system architecture is designed, and four functional modules are identified. Second, important data dimensions and indicators are selected according to the characteristics of the data in the data center, a multidimensional data model is built with OLAP technology, and ECharts, Saiku, and other data visualization technologies are used to implement visual query and analysis of data temperature and important dimensions, real-time monitoring of key data indicators, and all-round analysis of how the data in the center is used. |
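
The abstract describes data temperature as a single value that integrates access frequency, access time, and access relevance, but it does not give the formula. The following is only a minimal sketch of how such a score might be combined. The names `AccessStats` and `data_temperature`, the exponential recency decay, the one-day half-life, and the weights are all hypothetical assumptions for illustration, not the dissertation's actual definition.

```python
from dataclasses import dataclass


@dataclass
class AccessStats:
    """Hypothetical per-item access statistics (not from the dissertation)."""
    access_count: int            # accesses within the observation window
    seconds_since_last: float    # time elapsed since the most recent access
    related_hot_fraction: float  # share of related items that are hot, in [0, 1]


def data_temperature(stats, max_count, half_life_s=86400.0,
                     weights=(0.4, 0.4, 0.2)):
    """Combine frequency, recency, and relevance into a score in [0, 1].

    Assumed form: a weighted average of three normalized indicators.
    """
    if max_count <= 0:
        raise ValueError("max_count must be positive")

    # Frequency: access count normalized by the largest observed count.
    frequency = min(stats.access_count / max_count, 1.0)

    # Access time: exponential decay, halving every half_life_s seconds.
    recency = 0.5 ** (stats.seconds_since_last / half_life_s)

    # Relevance: clamp the supplied fraction into [0, 1].
    relevance = min(max(stats.related_hot_fraction, 0.0), 1.0)

    w_f, w_t, w_r = weights
    total = w_f + w_t + w_r
    return (w_f * frequency + w_t * recency + w_r * relevance) / total


# Usage: an item accessed often and recently scores higher than a stale one.
hot = data_temperature(AccessStats(100, 0.0, 1.0), max_count=100)
cold = data_temperature(AccessStats(0, 10 * 86400.0, 0.0), max_count=100)
```

Under these assumptions, a threshold on the returned score would separate hot from cold data, and storing the score alongside each item would let it follow the data through its life cycle as the abstract describes.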