| This paper addresses the challenges of multi-feature-driven streamflow prediction, including missing data, feature redundancy, low prediction accuracy, and unreasonable multi-model combination strategies. To tackle these issues, it proposes a methodology that combines machine learning models, feature selection methods, signal decomposition algorithms, and combination prediction strategies, and that can handle streamflow prediction at both daily and monthly scales. Case studies demonstrate the feasibility and effectiveness of the proposed methodology. The results have practical value for streamflow prediction, water resources management and utilization, flood control and disaster reduction, and related fields. The main research contents and achievements are as follows: (1) Revealed the runoff characteristics of the study area. The changing characteristics of runoff are examined in terms of annual and interannual variation, evolving trends, and periodicity. The results show that runoff in the study area is unevenly distributed within the year, differs greatly between years, shows significant seasonal differences, and follows a sustained downward trend. Multi-year runoff exhibits clear periodicity. (2) Improved missing-data processing for the inputs of machine learning models. Considering the shortcomings of traditional missing-data methods, this study proposes multiple imputation by chained equations with random forests (MICE-RF) for missing-data processing and applies it to the missing meteorological data of the Foping meteorological station on the Han River. The results show that, compared with traditional statistical and numerical-analysis imputation methods, machine learning imputation methods perform better, and the MICE-RF model has the best imputation
effect. The imputation accuracy of all models decreases as the missing rate increases. The accuracy of the machine learning imputation models increases with the length of consecutive missing data, the accuracy of the PCHIP and LINEAR methods decreases with it, and the MEAN and MEDIAN methods are unaffected by it. All models achieve their highest imputation accuracy when values are randomly missing. Among the missing meteorological variables, those with larger variance are imputed less accurately. (3) Studied daily runoff prediction with hydro-meteorological inputs. This study compares the forecasts of models that use runoff as the only input with those of multiple models that also use fused meteorological features. It applies three feature selection methods, namely the Pearson correlation coefficient, Lasso regression, and the variance inflation factor (VIF), to reduce feature dimensionality, compares runoff predictions before and after feature selection, and forecasts runoff over different forecast periods to verify model stability. The results show that models with fused meteorological data forecast more accurately than models with runoff as the only input. Feature selection not only improves the prediction accuracy of machine learning models but also saves computing resources and makes the models more stable. The Pearson correlation coefficient is the most effective feature selection method for improving model accuracy. As the forecast period increases, the daily runoff prediction accuracy of the fused meteorological data model gradually decreases and shows some instability. (4) Studied combined monthly runoff prediction based on runoff sequence decomposition. This study combines the EEMD, SSA, and VMD decomposition algorithms with machine learning models for runoff prediction. It introduces a sub-model selection process, distinguishes an identification
period and a combination period to improve and optimize the multi-model combination strategy, and finally uses the CRITIC and minimum-error-sum weighting methods to combine the predictions of the single models. The results show that the decomposition algorithms effectively extract the statistical regularities of the monthly runoff sequence and thus significantly improve the accuracy of monthly runoff prediction; VMD decomposition performs best. The combined model predicts more accurately than the best single model. Compared with directly combining single models, the optimized multi-model combination method yields higher prediction accuracy and more reliable results. |
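
The CRITIC weighting mentioned in item (4) can be illustrated with a minimal sketch. The abstract does not say how the decision matrix is built, so this example assumes each column holds one sub-model's predictions over the identification period. The function names (`critic_weights`, `combine`) and the sample values are hypothetical and are not taken from the study. CRITIC gives a column more weight when it has high contrast (standard deviation after min-max scaling) and low correlation with the other columns.

```python
import math

def _mean(xs):
    return sum(xs) / len(xs)

def _std(xs):
    # Sample standard deviation
    m = _mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def _pearson(xs, ys):
    mx, my = _mean(xs), _mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return cov / den if den > 0 else 0.0

def critic_weights(columns):
    """CRITIC weights for a list of equal-length columns (one per sub-model)."""
    # Min-max scale each column so that contrast is comparable across columns
    scaled = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled.append([(v - lo) / span if span > 0 else 0.0 for v in col])
    # Information content: contrast times conflict with the other columns
    info = []
    for j, cj in enumerate(scaled):
        conflict = sum(1.0 - _pearson(cj, ck)
                       for k, ck in enumerate(scaled) if k != j)
        info.append(_std(cj) * conflict)
    total = sum(info)
    if total == 0:
        # Degenerate case (e.g. identical columns): fall back to equal weights
        return [1.0 / len(columns)] * len(columns)
    return [c / total for c in info]

def combine(columns, weights):
    """Weighted sum of the sub-model predictions at each time step."""
    steps = len(columns[0])
    return [sum(w * col[t] for w, col in zip(weights, columns))
            for t in range(steps)]

# Hypothetical monthly predictions from three sub-models (illustrative only)
preds = [
    [120.0, 95.0, 60.0, 180.0, 240.0, 150.0],
    [110.0, 100.0, 70.0, 170.0, 260.0, 140.0],
    [130.0, 90.0, 55.0, 200.0, 220.0, 165.0],
]
weights = critic_weights(preds)
combined = combine(preds, weights)
```

In a workflow like the one described, the weights would presumably be estimated on the identification period and then applied to the sub-model predictions in the combination period. The minimum-error-sum method would instead choose the weights by minimizing the combined prediction error.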