Research on building extraction from remote sensing images is of great significance to military reconnaissance, land surveying, digital city construction, and other applications. Traditional building extraction methods based on hand-crafted features have notable drawbacks: on the one hand, they are easily affected by factors such as high-altitude shooting angles, complex background interference, and variation in target scale and shape; on the other hand, they are inefficient when processing large amounts of data and generalize poorly to real scenes. In recent years, many researchers have applied deep learning semantic segmentation algorithms to perform building extraction efficiently. This study proposes two semantic segmentation algorithms that analyze building remote sensing imagery from the perspectives of multi-scale information and contextual semantics. The main contributions are as follows:

(1) To address the problems of the classical SegNet network, including its large parameter count, susceptibility to vanishing or exploding gradients, weak building feature extraction, and lack of contextual semantic information, an improved encoder-decoder model, E-SegNet (Effective SegNet), is proposed. In the encoding stage, a designed separable residual block is introduced to ease training while extracting deep semantic information, and a designed multi-scale context attention module captures long-range context dependencies during multi-scale feature fusion. In the decoding stage, a designed feature fusion module filters the redundant information in the skip connections, alleviating the semantic gap between different levels.

(2) To handle the complex backgrounds and large scale variation of buildings in high-resolution remote sensing images, as well as the inability of common semantic segmentation networks to exploit multi-level and multi-scale information effectively, a multi-scale feature fusion network (MFF-Net) is proposed to improve multi-scale fusion and thereby achieve better multi-scale feature representation. MFF-Net uses an improved ResNet50 as its backbone network and introduces a designed multi-scale fusion structure between the down-sampling and up-sampling paths to fuse feature information from different levels and further extract multi-scale features, yielding stronger semantic expression. It also introduces convolutional attention modules to model spatial and channel relationships and improve the quality of feature extraction, and pyramid pooling modules to enhance the understanding of complex scenes.

(3) Experimental results show that, on the Massachusetts building dataset and the WHU building dataset, E-SegNet outperforms the comparison models on the evaluation metrics, requires less computation and fewer parameters than SegNet, and is insensitive to parameter changes. Compared with other state-of-the-art models, MFF-Net achieves higher segmentation accuracy, favorable complexity and robustness, and stronger generalization across different scenes.