| To cope with climate change caused by environmental pollution, countries have strengthened environmental protection policies, promoted clean energy, and reduced their dependence on fossil fuels. A main task of the contemporary electric power sector is therefore to promote renewable energy such as solar, wind, and hydropower. In recent years, the scale of photovoltaic power stations in China has continued to grow rapidly, and their share of power generation in the grid has kept increasing; the country's installed solar capacity currently ranks first in the world. In the actual output of a photovoltaic power station, the photovoltaic time series is affected by the randomness of weather factors, which leads to strong fluctuations in the power curve, large differences in power output between weather types, and noise and redundancy in the series. To address these problems and improve the reliability and stability of the power system, this paper first introduces the basic principles of photovoltaic power generation systems in detail by analyzing the basic structure of photovoltaic cells and the theory of power output. The factors influencing photovoltaic power prediction are analyzed in depth, and the applicable scenarios and basic principles of the data mining methods used for photovoltaic power data are explained. Because photovoltaic output power is related to many meteorological factors, traditional prediction methods based on weather-type classification modeling struggle to guarantee accurate results. To improve prediction performance and shorten model training time, this paper proposes two short-term photovoltaic power prediction models based on data mining and deep learning, one for deterministic prediction and one for interval prediction. Model 1 is a prediction method based on the combination of similar days and an improved 
bat algorithm optimized deep belief network (AMBOA-DBN). First, the historical photovoltaic data are divided into three main weather types. Using the environmental factors most strongly correlated with photovoltaic power generation, the historical daily rough-set samples that are highly similar to the forecast day are selected to form the training set of the prediction model. Because the output power sequences of similar days are closer to that of the forecast day, using similar-day data as training samples improves prediction accuracy. Next, an adaptive weight is introduced into the bat algorithm, and the resulting AMBOA algorithm is used to optimize the initial weights of the DBN; this addresses the randomness of the DBN's initial weights, which makes training prone to falling into local optima or to converging too slowly. Finally, a short-term deterministic photovoltaic power prediction model is built from AMBOA-DBN combined with the similar-day principle and compared with other prediction models to verify its accuracy. Model 2 is a photovoltaic output interval prediction model in which improved ensemble empirical mode decomposition is combined with a bidirectional long short-term memory neural network optimized by the quasi-affine algorithm (MEEMD-QUATRE-BILSTM). First, principal component analysis (PCA) is used to reduce the dimensionality of the time series, and K-means clustering then divides the reduced data into three meteorological types. Next, MEEMD decomposes the photovoltaic output sequence of each type, and the components are fed into the QUATRE-BILSTM deterministic photovoltaic power prediction model. Finally, according to the distribution characteristics of the deterministic prediction error, nonparametric kernel density estimation (KDE) is chosen to fit the probability density function of the error, and the confidence interval is solved to obtain the interval 
prediction model of photovoltaic output. The model is verified and analyzed using field data from a photovoltaic power station to test the effectiveness of its prediction performance. |
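
The final step of Model 2, turning deterministic prediction errors into an interval through KDE, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the error is taken to be actual minus predicted power, a Gaussian kernel with the normal-reference bandwidth rule is assumed, the error samples are synthetic, and every function name (`rule_of_thumb_bandwidth`, `kde_error_cdf`, `kde_quantile`, `prediction_interval`) is hypothetical.

```python
import math
import random


def rule_of_thumb_bandwidth(errors):
    """Normal-reference bandwidth h = 1.06 * std * n^(-1/5) (an assumed choice)."""
    n = len(errors)
    mean = sum(errors) / n
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / (n - 1))
    if std == 0.0:
        raise ValueError("errors must not all be identical")
    return 1.06 * std * n ** (-0.2)


def kde_error_cdf(errors, bandwidth):
    """CDF of a Gaussian KDE: the average of normal CDFs centred on each error."""
    n = len(errors)
    scale = bandwidth * math.sqrt(2.0)

    def cdf(x):
        return sum(0.5 * (1.0 + math.erf((x - e) / scale)) for e in errors) / n

    return cdf


def kde_quantile(cdf, p, lo, hi, tol=1e-6):
    """Invert a monotone CDF on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def prediction_interval(point_forecast, errors, confidence=0.9):
    """Shift the point forecast by the lower and upper KDE error quantiles."""
    h = rule_of_thumb_bandwidth(errors)
    cdf = kde_error_cdf(errors, h)
    alpha = 1.0 - confidence
    # The KDE mass outside [min - 10h, max + 10h] is negligible,
    # so this range safely brackets both quantiles.
    lo, hi = min(errors) - 10.0 * h, max(errors) + 10.0 * h
    q_low = kde_quantile(cdf, alpha / 2.0, lo, hi)
    q_high = kde_quantile(cdf, 1.0 - alpha / 2.0, lo, hi)
    return point_forecast + q_low, point_forecast + q_high


if __name__ == "__main__":
    # Synthetic errors standing in for (actual - predicted) on a validation set.
    rng = random.Random(0)
    errors = [rng.gauss(0.0, 0.05) for _ in range(500)]
    low, high = prediction_interval(1.20, errors, confidence=0.90)
    print(f"90% interval around a forecast of 1.20: [{low:.3f}, {high:.3f}]")
```

Because the quantiles come from the fitted density rather than from a normal assumption, the interval can be asymmetric around the point forecast when the error distribution is skewed. In practice one would presumably fit a separate error distribution per weather cluster, which this sketch leaves out.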