Research on Road Extraction Method of High-Resolution Remote Sensing Image Based on Transformer

Posted on: 2024-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: C L Miao
Full Text: PDF
GTID: 2542307106468594
Subject: Computer Science and Technology
Abstract/Summary:
As vital infrastructure, roads are a key target in remote sensing images. In recent years, rapid progress in satellite launches, aerospace technology and other cutting-edge technologies has advanced road extraction from remote sensing images and brought it into wide practical use, for example in traffic planning, smart cities, military operations, and the updating of geographic information databases. High-resolution satellite data provides high-quality ground feature information, but it also introduces significant noise, which makes road extraction more difficult. Traditional remote sensing road extraction methods suffer from poor road continuity and missed small roads. As the most advanced feature extraction method in deep learning, the Transformer has great potential. This thesis therefore studies road extraction from high-resolution remote sensing images using Transformer-based networks.

At present, there are two problems in applying the Transformer to vision tasks: (1) The scale of objects varies greatly. In road extraction, for example, road coverage may range from 5% to 95% or even more widely, so multi-scale hierarchical feature maps are crucial. (2) High-resolution remote sensing images contain many pixels, and the computational cost is too large. Especially for dense prediction tasks, the cost of the global attention used by the vanilla Transformer is proportional to the square of the image size, which is too heavy a burden for high-resolution remote sensing images and leads to excessively long inference time. To address these two problems, this thesis investigates Transformer-based road extraction models for high-resolution remote sensing images.

1. To handle the large variation in road scale, this thesis proposes a Transformer-based road extraction model for high-resolution remote sensing images. First, the model explores how embeddings at different scales and a dual-branch structure affect road extraction, feeding patches of different scales into the Transformer encoder through separate network branches to perform feature extraction. Second, to fully integrate the feature representations output by the different branches, this thesis proposes a multi-scale feature fusion module, which fuses the output features of the sub-networks into a more expressive representation. Finally, to alleviate the training convergence problems caused by the imbalance between positive and negative samples, this thesis uses a linear combination of two loss functions, which keeps the gradient stable during backpropagation and helps avoid local optima, and searches for the optimal hyperparameters of the mixed loss function experimentally. The model achieves a higher intersection over union (IoU) of 65.36% and 56.74% on the Massachusetts dataset and the DeepGlobe dataset, respectively. Visually, road continuity and the loss of detail are significantly improved.

2. To address the high computational complexity of processing high-resolution remote sensing images, this thesis proposes a hybrid attention mechanism with linear time complexity and builds a road extraction model with faster inference. The core idea of the hybrid attention is to combine the strong modeling ability of the Transformer with the priors of visual signals. In addition, this thesis further extends the dual-branch design. It explores the effect of multi-scale embedding on the Transformer-based model and, through heat map response analysis in the ablation experiments, concludes that small and medium-sized patch embeddings focus more on capturing details and narrow roads, while large patch embeddings focus more on the overall scene, the background and the global response. The experimental results show that the model obtains 67.36% IoU on the Massachusetts dataset and reduces inference time from 2.45 ms for the ViT model to 1.86 ms while maintaining road extraction quality.
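
As a rough illustration of the evaluation metric and the mixed loss mentioned above, the following minimal sketch computes IoU for a binary road mask and a linear combination of two losses. The abstract does not name the two loss functions or the weighting scheme, so binary cross-entropy, Dice loss, and the weight `alpha` are assumptions chosen for illustration only; the function names are hypothetical and are not taken from the thesis.

```python
import math

def iou(pred, target):
    """Intersection over union of two flat binary masks (1 = road pixel)."""
    inter = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    union = sum(1 for p, t in zip(pred, target) if p == 1 or t == 1)
    # Convention used here: two empty masks count as a perfect match.
    return inter / union if union else 1.0

def bce_loss(probs, target, eps=1e-7):
    """Mean binary cross-entropy (ASSUMED first loss term)."""
    total = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)

def dice_loss(probs, target, smooth=1.0):
    """Soft Dice loss (ASSUMED second loss term), less sensitive to class imbalance."""
    inter = sum(p * t for p, t in zip(probs, target))
    return 1.0 - (2.0 * inter + smooth) / (sum(probs) + sum(target) + smooth)

def mixed_loss(probs, target, alpha=0.5):
    """Linear combination of the two losses; alpha is a hypothetical hyperparameter."""
    return alpha * bce_loss(probs, target) + (1.0 - alpha) * dice_loss(probs, target)

# Toy example: 8 pixels, 2 of them road (imbalanced, as in real road masks).
target = [1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.9, 0.4, 0.2, 0.1, 0.1, 0.6, 0.1, 0.1]
pred = [1 if p >= 0.5 else 0 for p in probs]

print("IoU:", iou(pred, target))
print("mixed loss:", mixed_loss(probs, target, alpha=0.5))
```

In this sketch, `alpha` plays the role of the hyperparameter that the thesis reports tuning experimentally; a search would simply train with several values of `alpha` and keep the one with the best validation IoU.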
Keywords/Search Tags: road extraction, remote sensing image, Transformer, deep learning