
Research On Remote Sensing Image Scene Classification Based On Self-attention Mechanism

Posted on: 2024-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Liu
Full Text: PDF
GTID: 2542307100962119
Subject: Computer technology
Abstract/Summary:
The continuous development of satellite remote sensing technology has steadily increased the resolution of remote sensing images. Remote sensing images contain rich information on the shapes and spatial distribution of land features, and their accurate interpretation can provide a theoretical basis and scientific guidance for fields such as smart agriculture, earth observation, and urban planning. Scene classification is one of the fundamental technologies for interpreting remote sensing images; it aims to infer the correct scene category by analyzing the land information in an image. Because of the combined effect of factors such as the diversity of land features, the environment, and data acquisition methods, remote sensing images exhibit "high inter-class similarity and large intra-class variability", which poses great challenges to scene classification. In recent years, classic deep learning classification algorithms, represented by Convolutional Neural Networks (CNNs) and the Transformer, have made significant progress on this challenge. However, these algorithms still suffer from weak global modeling ability, high computational complexity, and poor real-time performance. To address these issues, this thesis studies remote sensing image scene classification based on the self-attention mechanism and proposes several new deep learning models and network lightweighting methods. The main research contents of this thesis are as follows:

(1) To address the problem that CNNs cannot model images globally, a remote sensing image scene classification method based on self-attention and CNN is proposed. The last 4 convolutional layers of the VGG19 model are replaced by two cascaded self-attention modules to enhance the model's ability to extract global information. In addition, batch normalization layers are added within the model to accelerate convergence and improve generalization performance. To reduce model complexity, the number of fully connected layers is also reduced. Through these optimization measures, the model remains lightweight while the accuracy of scene classification improves.

(2) To address the high computational complexity of Transformer models and their dependence on large-scale data, a remote sensing image scene classification method based on a CNN-enhanced Transformer encoder is proposed. First, a CNN is used to pre-encode images, transforming high-resolution images into low-dimensional sequence data. Then, the Transformer encoder is employed to capture long-range dependencies in the sequence. This design not only effectively reduces model complexity but also greatly improves the model's classification performance. In addition, the introduction of the CNN gives the Transformer encoder good inductive biases, thereby avoiding excessive reliance on large-scale data.

(3) To address the large model size and poor real-time performance of deep learning models, a remote sensing image scene classification method based on knowledge distillation is proposed. First, a parameter-free attention module is added to the lightweight MobileNetV2 model to extract key information from the image while ignoring irrelevant information. Second, the teacher model "Swin Transformer" is used to guide the training of the student model "MobileNetV2" to enhance its feature extraction capabilities. After optimization, the MobileNetV2 model is lightweight and efficient, making it suitable for resource-limited mobile devices.
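As an illustration of the "global modeling" that contributions (1) and (2) rely on, the following is a minimal sketch of scaled dot-product self-attention in plain Python. It is not the thesis's module: the abstract does not give the exact attention design, so this sketch assumes the standard formulation and, for brevity, uses the input tokens directly as queries, keys, and values (no learned projections). The function names `softmax` and `self_attention` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over n tokens of dimension d.

    Assumption: queries, keys, and values are the tokens themselves
    (learned projection matrices are omitted to keep the sketch short).
    Returns the attended outputs and the n x n attention weights.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    outputs, all_weights = [], []
    for q in tokens:
        # Similarity of this token to every token in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in tokens]
        weights = softmax(scores)
        # Each output is a weighted sum over ALL positions (global context).
        out = [sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy "feature map positions", each a 2-dimensional feature vector.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Every row of `weights` is strictly positive and sums to 1, so each output position mixes information from the whole sequence, unlike a convolution with a local receptive field. The weight matrix is n x n, which is why shortening the sequence with a CNN before the Transformer encoder, as in (2), lowers the attention cost.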
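The teacher-student training in contribution (3) can be sketched with a common soft-target distillation loss. The abstract does not state which loss the thesis uses, so everything below is an assumption: the loss form (hard-label cross-entropy plus a temperature-scaled KL term), the `temperature` and `alpha` values, and the function names are illustrative only.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperature gives softer outputs."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Assumed loss: alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher || student).

    temperature and alpha are hypothetical hyperparameters, not values
    reported in the thesis.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    p_student = softmax(student_logits)
    ce = -math.log(p_student[label])

    # Soft-label term: KL divergence between softened distributions.
    p_teacher_soft = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher_soft, p_student_soft) if t > 0)

    # T^2 keeps the soft-term gradient scale comparable across temperatures.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl


# Toy logits for a 3-class problem; in the thesis's setting the teacher
# would be Swin Transformer and the student MobileNetV2.
teacher = [2.0, 1.0, 0.1]
matched = distillation_loss([2.0, 1.0, 0.1], teacher, label=0)
mismatched = distillation_loss([0.1, 1.0, 2.0], teacher, label=0)
```

When the student reproduces the teacher's logits, the KL term vanishes and only the hard-label term remains; a student that disagrees with the teacher is penalized by both terms, so `mismatched` is larger than `matched`.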
Keywords/Search Tags: Remote Sensing Image Scene Classification, Convolutional Neural Network, Self-Attention Mechanism, Knowledge Distillation