| Gross primary productivity (GPP) is a crucial flux component in terrestrial ecosystems, exerting significant impacts on global carbon cycling and ecosystem functioning. Accurate estimation of GPP can reveal productivity differences among regions and plant functional types, help evaluate the health status and productivity potential of ecosystems, and has important implications for ecology, climatology, and environmental protection. In recent years, the rapid development of machine learning and deep learning technologies has provided technical support for GPP estimation. This study used near-infrared reflectance of vegetation (NIRv), temperature (TA), shortwave radiation (SW), and vapor pressure deficit (VPD) as input variables, based on 84 flux tower sites, combined with the FLUXNET2015 dataset and remote sensing data. Four machine learning models, namely a backpropagation neural network (BPNN), random forest (RF), long short-term memory (LSTM), and convolutional neural network (CNN), were selected to construct the GPP estimation models. The estimation results were evaluated using four statistical indicators: the coefficient of determination (R²), Nash-Sutcliffe efficiency coefficient (NSE), root mean square error (RMSE), and mean absolute error (MAE). This study aimed to identify effective and versatile models for estimating GPP in terrestrial ecosystems with high accuracy. The findings can provide data support for carbon cycling and carbon budget studies, as well as promote the sustainable development of an ecological society. The main results were as follows: (1) NIRv exhibited the highest correlation with GPP, with a Pearson correlation coefficient of 0.7219 and a Spearman correlation coefficient of 0.7224. The correlation of the vegetation indices, including NIRv, enhanced vegetation index (EVI), kernel normalized difference vegetation index (kNDVI), normalized difference vegetation index (NDVI), and land surface water index (LSWI), with GPP was higher than
that of environmental factors such as TA, SW, VPD, and precipitation (P). Among the candidate drivers of the GPP estimation models, NDVI, EVI, NIRv, kNDVI, the fraction of absorbed photosynthetically active radiation (FPAR), and leaf area index (LAI) showed varying degrees of collinearity. NIRv, TA, SW, and VPD all contributed to the GPP estimation models; NIRv made the largest contribution, with an information gain ratio of 0.2543. (2) The BPNN, RF, LSTM, and CNN models all performed satisfactorily in estimating GPP in terrestrial ecosystems. The LSTM model had the highest accuracy, followed by the CNN and RF models, while the BPNN model had the lowest. During the testing phase, the LSTM model achieved R² of 0.647 to 0.862, NSE of 0.634 to 0.861, RMSE of 0.645 to 2.428 g C m⁻² d⁻¹, and MAE of 0.405 to 1.342 g C m⁻² d⁻¹. Compared with the CNN, RF, and BPNN models, the LSTM model performed better on time-series estimation: it retained temporal information more effectively and achieved higher accuracy, making it more suitable for estimating GPP in terrestrial ecosystems. (3) Incorporating an attention mechanism into the LSTM model, yielding the long short-term memory model with an attention mechanism (LSTM-Attention), effectively highlighted variables of high importance and improved the accuracy of GPP estimation in terrestrial ecosystems. As a result, the LSTM-Attention model demonstrated strong generalizability and predictive performance. During the testing period, the LSTM-Attention model achieved R² of 0.685 to 0.890, NSE of 0.684 to 0.887, RMSE of 0.632 to 2.077 g C m⁻² d⁻¹, and MAE of 0.424 to 1.351 g C m⁻² d⁻¹. Compared with the LSTM model, the LSTM-Attention model improved R² by 2.14% to 7.47%, NSE by 2.26% to 9.50%, RMSE by 2.02% to 27.54%, and MAE by 0.36% to 4.69%. The LSTM-Attention
model is recommended for estimating GPP in terrestrial ecosystems. (4) The accuracy of GPP estimation differed significantly among plant functional types. Deciduous broadleaf forests showed the highest accuracy, with an average R² of 0.854 and an average RMSE of 1.889 g C m⁻² d⁻¹. C3 croplands showed the lowest accuracy, with an average R² of 0.626 and an average RMSE of 2.242 g C m⁻² d⁻¹. This study explored the reasons for these differences. It suggested that differences in geography, data, and vegetation characteristics were the main factors affecting the parameters and methods used to develop deep learning models, ultimately affecting the accuracy of GPP estimation. Therefore, to improve accuracy when using deep learning models for GPP estimation, it is crucial to adjust input variables and model parameters according to plant functional type to prevent overfitting or underfitting. |
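
The four evaluation indicators named in the abstract can be illustrated with a minimal sketch. The abstract does not give its exact formulas, so the definitions below are assumptions: R² is taken as the squared Pearson correlation between observed and estimated GPP (which would explain why the reported R² and NSE values differ slightly), and NSE as one minus the ratio of the residual sum of squares to the variance of the observations. The function name `evaluate` and the sample values are hypothetical.

```python
import math


def evaluate(observed, predicted):
    """Return R2, NSE, RMSE and MAE for paired observed/predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n

    # Sums of squares used by several of the indicators
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_o) ** 2 for o in observed)
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)
    if ss_obs == 0 or ss_pred == 0:
        raise ValueError("series must not be constant")

    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))

    # Assumption: R2 is the squared Pearson correlation coefficient
    r2 = cov ** 2 / (ss_obs * ss_pred)
    # Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean
    nse = 1.0 - ss_res / ss_obs
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return {"R2": r2, "NSE": nse, "RMSE": rmse, "MAE": mae}


# Hypothetical daily GPP values (g C m-2 d-1), estimates biased high by 0.5
metrics = evaluate([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
print(metrics)
```

Under these assumed definitions, a constant bias leaves R² at 1 while lowering NSE and raising RMSE and MAE, which is one reason to report all four indicators together.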