
Research On Semantic Segmentation Technology Of Aerial Images Based On Deep Learning

Posted on: 2022-10-05  Degree: Master  Type: Thesis
Country: China  Candidate: M L Li  Full Text: PDF
GTID: 2492306509997629  Subject: Computer application technology
Abstract/Summary:
In recent years, the amount of remote sensing image data has been increasing, so accurate and automatic analysis of aerial images has become an urgent problem. Image semantic segmentation aims to divide the input image into non-overlapping regions and to assign each region a semantic label, yielding a fine-grained parsing result. Using semantic segmentation to divide aerial images accurately and automatically into different parts could help locate affected roads, bridges and houses in major natural disaster areas, and thus provide guidance for the subsequent disaster rescue. Based on deep learning, this paper mainly focuses on improving segmentation accuracy and accelerating inference speed. The main contributions of this paper are as follows:

(1) To address the low segmentation accuracy caused by two kinds of classification errors when common semantic segmentation models are applied to remote sensing scenes, this paper proposes a Global-Local Attention Network, named GLANet. GLANet is designed to eliminate both types of errors in the semantic segmentation of aerial images at the same time. When classical segmentation models, represented by the Fully Convolutional Network, are applied to aerial scenes, the classification errors fall into two categories: large-area misclassification inside big objects and inaccurate local boundaries. Previous attention-based methods typically capture rich global contextual information, which benefits large-area classification but cannot correct local boundary errors. GLANet considers the global context and local details simultaneously. Specifically, GLANet consists of two branches: a global attention branch and a local attention branch. Three different modules are embedded in GLANet to model the semantic interdependencies in the spatial, channel and boundary dimensions, respectively. Finally, the outputs of the different branches are merged to further enhance the feature representation. Thanks to the rich global and local context information extracted by the two branches, GLANet could improve the segmentation accuracy of both large objects and local boundaries simultaneously. This paper conducts comprehensive experiments on two popular aerial scene segmentation datasets, Vaihingen and Potsdam, and the experimental results demonstrate that GLANet could achieve higher segmentation accuracy than existing work.

(2) To address the problem that deep learning models with high segmentation accuracy often suffer from low inference speed because of their complexity, this paper proposes a novel Dual Relation Distillation framework, named DRD. DRD could narrow the gap between the student model and the teacher model to a greater extent, and thus achieve a better trade-off between segmentation accuracy and inference speed. In recent years, the accuracy of CNN models for semantic segmentation has improved significantly. However, models with high segmentation accuracy are very heavy and generally suffer from low inference speed, which limits their application in practice. One promising way to achieve a good trade-off between segmentation accuracy and efficiency is knowledge distillation. This paper studies knowledge distillation and proposes a novel dual relation distillation framework that transfers both the spatial correlation and the channel correlation in feature maps from the cumbersome model (teacher) to the compact model (student). Concretely, DRD computes spatial relation maps and channel relation maps separately for the teacher and the student, then aligns the corresponding relation maps by minimizing their distance. Because of the difference in model complexity and number of parameters, the teacher usually learns more knowledge and therefore captures richer spatial and channel correlations than the student. Transferring these correlations from the teacher model to the student model could help the student mimic the teacher better in terms of feature distribution, and thus improve the segmentation accuracy of the student model. This paper evaluates the proposed DRD on two widely adopted benchmarks in the remote sensing field (Vaihingen and Potsdam). The experimental results demonstrate that the proposed distillation framework is able to significantly boost the performance of the student network without incurring extra computational overhead.
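The relation-map alignment described for DRD can be illustrated with a minimal sketch. The abstract does not state which similarity measure or distance DRD uses, so cosine similarity and a mean squared distance are assumptions made here for illustration only. The function names (`cosine_gram`, `dual_relation_loss`) are hypothetical, and the sketch assumes teacher and student feature maps have already been flattened to the same C x N shape (channels by spatial positions).

```python
import math


def cosine_gram(rows):
    """Pairwise cosine similarity between the row vectors of a 2-D list."""
    normed = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0  # guard all-zero rows
        normed.append([v / norm for v in row])
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in normed] for r1 in normed]


def transpose(matrix):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*matrix)]


def mean_squared_distance(a, b):
    """Mean squared element-wise difference between two equally shaped matrices."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def dual_relation_loss(teacher, student):
    """Assumed loss: sum of spatial and channel relation-map distances.

    teacher, student: feature maps flattened to C x N (channels x positions).
    """
    # Channel relation map: C x C similarities between channel response vectors.
    channel_term = mean_squared_distance(cosine_gram(teacher), cosine_gram(student))
    # Spatial relation map: N x N similarities between per-position feature vectors.
    spatial_term = mean_squared_distance(
        cosine_gram(transpose(teacher)), cosine_gram(transpose(student))
    )
    return spatial_term + channel_term


# Toy features: 2 channels, 3 spatial positions (made-up values).
teacher = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
student = [[0.5, 0.1, 1.5], [0.2, 0.9, 0.7]]
loss = dual_relation_loss(teacher, student)
```

In this sketch the loss is zero when the student's relation maps match the teacher's and grows as they diverge. Because such a term would only be used during training, it adds nothing to the student's inference cost, which is consistent with the abstract's claim of no extra computational overhead.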
Keywords/Search Tags:Semantic Segmentation, Deep Learning, Self-attention Mechanism, Knowledge Distillation, Aerial Image Processing