Land Cover Classification Using The Combination Of Sentinel-2 Multi-temporal Data, Gradient Boosting Decision Tree And Random Forest

Posted on: 2022-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: H D Li
GTID: 2480306482471014
Subject: Cartography and Geographic Information System
Since Sentinel-2A and Sentinel-2B began operating together with a 5-day revisit cycle, it has become easy to obtain multi-temporal data within a single year. Images acquired at different times capture how the same feature changes with the seasons, yielding more feature subsets, which may help improve the accuracy of land cover classification. After excluding images affected by snow and clouds, Sentinel-2 images from 5 different periods in 2019 covering the middle reaches of the Huangshui River Basin were selected. Their 10 m spatial resolution bands were combined with spectral indices, texture features and terrain factors to form the multi-temporal dataset of the study area, which was then classified with two ensemble learning methods: random forest (RF) and gradient boosting decision tree (GBDT). In addition to classifying the multi-temporal dataset directly, the particle swarm optimization algorithm was used to select features from it, in order to simplify the classification model and improve its reliability and interpretability. The classification results were compared with those of traditional single-period image classification. The research showed that:

(1) From the perspective of accuracy verification, multi-temporal data improve the classification accuracy. Compared with the single-period August image, the overall classification accuracy of RF increased by 5.34% to 89.20%, and that of GBDT increased by 5.60% to 90.80%. After feature selection through particle swarm optimization, the classification accuracy changed little. Compared with the classification results without feature selection, the overall accuracy of RF increased slightly by 0.58%, which may be due to the limited applicability of RF to high-dimensional data, while the overall accuracy of GBDT dropped by only 0.10% and was almost unchanged.

(2) In addition to improving the classification accuracy, the multi-temporal data also helped to extract other forest lands, which had not been distinguished separately in previous research. According to the standardized Euclidean distances calculated between different feature types, multi-temporal data can greatly increase the inter-class distance of the features, giving classifiers a larger decision space. This is the fundamental reason why they yield better classification results. In terms of classification time, under the same hardware and software conditions, the time cost of GBDT and RF on the multi-temporal data was 2.78 times and 2.07 times that of their single-period image classification, respectively. Considering both accuracy and time cost, multi-temporal data can be regarded as well suited to land cover classification.

(3) According to the feature bands selected by the particle swarm optimization algorithm, among the five time phases the June data were selected most often, with an overall selection probability across the two methods as high as 71.88%. The information entropy calculations show that the NIR band of the June image had relatively high information entropy among all the original bands, so it can more comprehensively reflect the spectral characteristics of vegetation, one of the main features of the study area. Compared with the August data, whose NIR band has similarly high information entropy, the visible bands of the June image also retain a higher amount of information, which is more conducive to distinguishing different features. From the perspective of feature band types, texture features played an important role in both the RF and GBDT methods, accounting for 39.58% and 55.00% of the total number of feature bands respectively. This indicates the importance of texture information to land cover classification and is consistent with the conclusions of other studies.

(4) From the perspective of classification methods, in the three data classification scenarios in this thesis, the overall accuracy of GBDT, which is based on the Boosting strategy, is 1% higher than that of the RF algorithm, and its applicability to high-dimensional data is also better. In general, however, when the classifier is already strong, changing the method yields only a very limited improvement in accuracy. In contrast, multi-temporal data are a more effective means of improving classification accuracy.
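
The two separability measures mentioned in points (2) and (3) can be illustrated with a minimal sketch. The abstract does not describe how the thesis computed them, so the function names, the toy class means, the variances and the quantized band values below are all hypothetical, and the entropy is assumed to be the Shannon entropy of the band's value histogram.

```python
import math
from collections import Counter


def standardized_euclidean(u, v, variances):
    """Euclidean distance with each squared difference divided by that feature's variance."""
    return math.sqrt(sum((a - b) ** 2 / s for a, b, s in zip(u, v, variances)))


def shannon_entropy(values):
    """Shannon entropy in bits of the discrete value distribution of a band."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical class-mean feature vectors for one date (not data from the thesis).
cropland_single = [0.30, 0.60]
forest_single = [0.10, 0.80]
variances_single = [0.01, 0.04]  # assumed per-feature variances

# Stacking features from a second, hypothetical date appends extra dimensions.
cropland_multi = cropland_single + [0.50]
forest_multi = forest_single + [0.20]
variances_multi = variances_single + [0.09]

d_single = standardized_euclidean(cropland_single, forest_single, variances_single)
d_multi = standardized_euclidean(cropland_multi, forest_multi, variances_multi)

# Toy quantized pixel values of one band: four equally frequent levels.
band = [0, 0, 1, 1, 2, 2, 3, 3]
h = shannon_entropy(band)

print(d_single, d_multi, h)
```

Each appended feature contributes a non-negative term under the square root, so in this sketch the stacked distance can never be smaller than the single-date one. How much it grows depends on how well the added date separates the two classes.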
Keywords: Land cover classification, Multi-temporal images, Sentinel-2, Gradient boosting decision tree, Random forest, the Huangshui River Basin