| Extracting surface buildings is of great practical significance for urban planning and economic development. Earth observation satellites and drones now provide large volumes of high-resolution imagery, and as the volume of remote sensing data grows, the bottleneck has shifted from acquiring images to processing them. Deep neural networks can automatically analyze the detailed information carried by remote sensing images, and many researchers have applied them to semantic segmentation of remote sensing images with some success, but limitations remain, such as low extraction accuracy and poor post-processing results. To address these problems, this paper improves a typical semantic segmentation network for building extraction, aiming at accurate and efficient extraction of buildings from a UAV dataset. The paper first surveys the state of deep learning based building extraction research in China and abroad, and selects Yushu Tibetan Autonomous Prefecture in Qinghai Province as the study area. Remote sensing images are acquired by unmanned aerial vehicle (UAV) flights, and image tiles are produced by sliding-window cropping. Random erasing data augmentation increases the complexity of the data, which helps prevent overfitting and improves the generalization ability of the network. Labels are then annotated manually to create the UAV image dataset. On the UAV dataset and the WHU dataset, the building extraction performance of four classic semantic segmentation networks, PSPNet, SegNet, DeepLabv3+ and U-Net, is compared. Model performance on the validation set is computed during training, and with early stopping, training is halted when validation performance begins to decline. Morphological post-processing of the extraction results was
carried out to smooth building contours, remove small holes in building areas and shadow debris in non-building areas, and bridge broken lines. Experimental results show that the U-Net network has strong robustness and generality. To address the complex backgrounds of remote sensing images, the large variation in target scale, and missed and false detections, the U-Net architecture is improved by combining dilated convolution, a regularization method, and the scSE attention module. Dilated convolution expands the network's receptive field, improves the use of background information at multiple scales, and enhances the network's ability to represent complex features; the regularization method addresses overfitting and improves generalization; the scSE attention module extracts salient information from the feature maps, strengthens the network's ability to identify building features, and reduces the semantic gap between shallow and deep features. To explore the specific effect of each improvement strategy, ablation experiments were carried out on the UAV dataset. Relative to the classic U-Net, dilated convolution and regularization together increase overall accuracy by 2.2%, F1-score by 0.9%, and intersection over union (IoU) by 1.9%; the scSE attention module increases overall accuracy by 2.9%, F1-score by 0.4%, IoU by 2.1%, and the kappa coefficient by 2.8%; and all improvement strategies combined increase overall accuracy by 3.6%, F1-score by 2.8%, IoU by 2.9%, and the kappa coefficient by 2.9%. These results indicate that each improvement strategy in this paper is effective, and that the improved U-Net model has stronger generalization
capabilities and higher extraction accuracy. |
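
The four evaluation metrics named in the abstract (overall accuracy, F1-score, intersection over union, and kappa coefficient) can be illustrated with a minimal sketch. The abstract does not give the formulas the authors used, so the code below assumes the standard definitions for binary building/background masks; the function name `building_metrics` and the toy masks are hypothetical and not taken from the paper.

```python
from typing import Dict, Sequence


def building_metrics(pred: Sequence[Sequence[int]],
                     truth: Sequence[Sequence[int]]) -> Dict[str, float]:
    """Standard binary segmentation metrics (assumed definitions).

    Both masks use 1 for building pixels and 0 for background.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1      # building correctly extracted
            elif p == 1 and t == 0:
                fp += 1      # false detection
            elif p == 0 and t == 1:
                fn += 1      # missed detection
            else:
                tn += 1      # background correctly rejected

    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("masks must contain at least one pixel")

    overall_accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (overall_accuracy - chance) / (1 - chance) if chance != 1 else 0.0

    return {
        "overall_accuracy": overall_accuracy,
        "f1": f1,
        "iou": iou,
        "kappa": kappa,
    }


# Toy 2x4 masks (hypothetical data, for illustration only).
predicted = [[1, 1, 0, 0],
             [1, 0, 0, 0]]
reference = [[1, 1, 0, 0],
             [0, 1, 0, 0]]
metrics = building_metrics(predicted, reference)
```

On these toy masks, overall accuracy (0.75) is noticeably higher than IoU (0.5) because the many correctly classified background pixels count toward accuracy but not toward IoU, which is one reason to report several metrics side by side in an ablation study.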