Hyperspectral Image Classification Based On Vision Transformer

Posted on: 2023-09-15
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Chen
GTID: 2532306908450834
Subject: Engineering
Abstract/Summary:
Hyperspectral images, captured by remote sensing equipment, consist of hundreds of spectral bands and contain more information than conventional images. The task of hyperspectral image classification is to assign a category label to each pixel in order to obtain the distribution of surface features over the whole image. The classification results are useful for environmental protection, geological exploration, forestry planning and military deployment. Deep learning methods are the most commonly used approach to this task, and methods based on Convolutional Neural Networks achieve good classification performance. However, while a Convolutional Neural Network can extract local spatial features, it often fails to capture non-local correlations between spectral bands, so the spectral information in hyperspectral images is not fully exploited. Recently, the Vision Transformer, an architecture that works differently from Convolutional Neural Networks, has been applied in the field of image processing. The core of the Vision Transformer is the self-attention mechanism, which captures both long-range and short-range dependencies simultaneously and uses global information to improve feature discrimination. The application of the Vision Transformer to hyperspectral image processing is still at an early stage and needs further research. This thesis mainly covers the following aspects:

1. From the perspective of model optimization, a multi-stage Vision Transformer is proposed that borrows the feature extraction structure of Convolutional Neural Networks. The input of the proposed model is a sequence of image patches, which introduces spatial features. A spectral attention module is added to learn spectral features and thereby improve classification accuracy. The training data set is expanded by generating reliable virtual samples. Compared with other Vision Transformer based methods, the multi-stage Vision Transformer with spectral attention and expanded samples achieves better results in hyperspectral image classification tasks.

2. From the perspective of algorithm improvement, it is observed that architectures based on Convolutional Neural Networks focus on local feature extraction, while the Vision Transformer can fully mine global information. To make full use of the diversity and complementarity of the features extracted by the two models, a new strategy called CNN-Transformer Co-learning is proposed through a dual-architecture ensemble design: high-reliability samples are iteratively added to the next round of training, which increases the difference between the training samples of different rounds. Experimental results show that this method outperforms using only one of the two networks. In addition, a data augmentation method is used in this chapter to improve training accuracy.

3. From the perspective of input form, the original Vision Transformer architecture generally takes one of two forms of input: spectral sequences or image patches. When the input is a spectral sequence, the information contained in the spatial domain cannot be processed. Similarly, when the input is a sequence of image patches, the model cannot analyze the spectral information. In this chapter, a cascaded spectral-spatial Vision Transformer model is proposed in which the features of the spectral domain and the spatial domain are extracted separately and analyzed jointly, making full use of the complementary information from the two domains. Experiments show that this method improves the accuracy of hyperspectral image classification.

Through the above research, progress is made on the application of the Vision Transformer to hyperspectral image classification by combining spatial domain and spectral domain information, mining the correlation of spectral information, and applying data augmentation and attention mechanisms.
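As a rough illustration of the self-attention mechanism that the abstract credits with capturing both long-range and short-range dependencies, the following NumPy sketch computes single-head scaled dot-product self-attention over a sequence of tokens. It is not the thesis's implementation: the function names, the token count, the embedding size and the random projection matrices are all assumptions made for the example, and the tokens could stand for either embedded spectral bands or embedded image patches.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    tokens: (num_tokens, dim) array, e.g. embedded bands or patches (assumed).
    w_q, w_k, w_v: (dim, dim) projection matrices (hypothetical, random here).
    Returns the attended tokens and the (num_tokens, num_tokens) weight matrix.
    """
    q = tokens @ w_q
    k = tokens @ w_k
    v = tokens @ w_v
    # Every token is scored against every other token, near or far.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

# Illustrative sizes only; a real model would learn the projections.
rng = np.random.default_rng(0)
num_tokens, dim = 8, 16
tokens = rng.normal(size=(num_tokens, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))
out, weights = self_attention(tokens, w_q, w_k, w_v)
```

Each row of `weights` sums to one and spans the whole sequence, which is what lets a token draw on distant bands or patches as easily as on adjacent ones.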
Keywords/Search Tags: hyperspectral image classification, Vision Transformer, convolutional neural network, data augmentation, attention mechanism