| In photovoltaic (PV) power prediction, ultra-short-term forecasting covers a relatively short time scale and focuses on the real-time fluctuation of PV output under complex weather conditions. Identifying the factors that affect the output power of PV systems and producing accurate, timely power predictions is essential to the safe and stable operation of the power system, and it also provides important guidance for the dispatch and management of grid-connected PV. First, this paper preprocesses the PV data and corrects the forecast irradiance. Abnormal data are identified using the 3-sigma rule and the empirical irradiance-PV power engineering curve, and are then repaired with the nearest neighbor algorithm. The features are analyzed qualitatively and quantitatively using the Pearson correlation coefficient and the XGB feature importance score; the analysis shows that irradiance is the most important factor affecting PV power, so the forecast irradiance is corrected with a BP neural network. In addition, the PV data set is expanded using an upscaling technique, and all data are normalized. Second, cluster analysis is carried out on the corrected PV data. This paper proposes a clustering method based on the historical PV power envelope. After the PV power envelope is established, the envelope area, the number of crests and valleys, and the average value of the envelope are selected as clustering indices. These indices measure the fluctuation range and frequency of the PV power curve and thus capture the real-time fluctuation of PV power. The silhouette coefficient shows that the clustering method based on the historical PV power envelope performs better than the meteorological-factor clustering method and the power-interval clustering method. Third, to address the large errors of traditional single forecasting models, this paper builds forecasting models that consider both discreteness and time-series characteristics. The discrete prediction builds an ensemble learning model based on the stacking framework on top of the envelope clustering, and the time-series prediction builds an LSTM model on all of the data. The discrete and time-series forecasting models are fused in a ratio of 0.7 to 0.3, which improves both accuracy and generalization. Cross-validation and grid search are also used to reduce the risk of overfitting and to tune the predictive model for the best performance. Finally, simulation analysis shows that when the PV power envelope is selected as the clustering index, the prediction accuracy is significantly higher than with meteorological-factor or power-interval clustering, and the fused machine learning model generally outperforms the single-algorithm models. |
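
As a rough illustration of three steps named in the abstract (3-sigma screening, the envelope clustering indices, and the 0.7/0.3 fusion), a minimal Python sketch might look like the following. All function names are hypothetical. The abstract does not say how the envelope is constructed or which model receives which weight, so the sketch takes the envelope curve as given, assumes trapezoidal integration for the area, and assumes that the weight 0.7 applies to the discrete (stacking) model simply because it is mentioned first.

```python
from statistics import mean, pstdev


def three_sigma_outliers(values, k=3.0):
    """Return indices of samples farther than k standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sigma]


def count_crests_and_valleys(curve):
    """Count local extrema as sign changes between successive non-zero differences."""
    diffs = [b - a for a, b in zip(curve, curve[1:]) if b != a]
    return sum(1 for d0, d1 in zip(diffs, diffs[1:]) if d0 * d1 < 0)


def envelope_features(envelope, dt=1.0):
    """Return (area, number of crests and valleys, mean) for an envelope curve.

    The envelope is taken as given; how it is built is not specified in the
    abstract. The area uses the trapezoidal rule with sample spacing dt
    (an assumption).
    """
    area = sum((a + b) * dt / 2.0 for a, b in zip(envelope, envelope[1:]))
    return area, count_crests_and_valleys(envelope), mean(envelope)


def fuse(discrete_pred, series_pred, w_discrete=0.7, w_series=0.3):
    """Weighted fusion of two forecasts.

    Assigning 0.7 to the discrete model and 0.3 to the time-series model is
    an assumption; the abstract only states the ratio.
    """
    return [w_discrete * d + w_series * s for d, s in zip(discrete_pred, series_pred)]


if __name__ == "__main__":
    # Toy data for illustration only, not taken from the paper.
    print(three_sigma_outliers([1.0] * 20 + [100.0]))
    print(envelope_features([0.0, 2.0, 1.0, 3.0, 0.0]))
    print(fuse([10.0, 20.0], [20.0, 10.0]))
```

In such a sketch, the three values returned by `envelope_features` would form the feature vector passed to a clustering algorithm, one vector per historical day or window.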