| With advances in sensor technology, the resolution of remote sensing satellites has entered the sub-meter era. Surveying and analyzing artificial ground objects in high-resolution remote sensing images is a key task in remote sensing, and buildings are among the most important of these objects. Building extraction is indispensable in many fields, such as urban planning, intelligent building survey, military reconnaissance, and high-precision mapping. Many deep learning methods have been proposed in recent years. Among them, semantic segmentation models are widely used to extract buildings, but most of these methods suffer from heavy models and poor learning ability. Against this background, this study proposes a semantic segmentation model based on feature de-redundancy and trains and optimizes it on two building image datasets. The contributions of this paper are as follows. (1) We use an encoder-decoder convolutional neural network, which is well suited to building extraction, as the underlying framework of the model. By analyzing the underlying logic of the encoder, we find that an encoder built on standard convolution kernels consumes substantial convolution computation and produces redundant feature maps when learning building features. We therefore replace the standard convolution with the Ghost module to extract building feature maps. The Ghost module generates the redundant feature maps through cheap linear transformations, greatly reducing the parameters and FLOPs of the model. (2) In the datasets of this study, background pixels, which are treated as negative examples, occupy most of the image area, which biases the model's learning toward background pixels. We therefore first introduce the CBAM (Convolutional Block Attention Module) into the encoder, which helps the encoder pay more attention to building pixels in both the spatial and channel dimensions. Next, the FPEM (Feature Pyramid Enhancement Module) is used to enhance and fuse, through up-scale and down-scale passes, the multi-scale feature maps generated by the encoder, improving the encoder's ability to learn building features. Finally, the Focal loss function is used to balance the loss weights of the building and background classes during backpropagation, so that the learning results of the model are no longer biased toward the background class. (3) To verify the theoretical and inference results of the model, several semantic segmentation models based on the encoder-decoder structure are selected as the control group for experimental analysis. First, the segmentation performance of all models is roughly evaluated from the predicted building masks. Next, the classification ability of each model is evaluated by the prediction categories of the confusion matrix, and each model is evaluated quantitatively using multiple semantic segmentation metrics derived from the confusion matrix. Finally, the resource consumption of each model is compared to analyze how lightweight the models are. In addition, an ablation study is carried out to find the best combination of attention mechanisms. Using images of a mining area as a real-world test dataset, the practical performance of the method is evaluated and analyzed. In this study, we use the U-Net semantic segmentation model to realize automatic extraction of buildings. Encoding with feature de-redundancy reduces the model's parameter count by a factor of 12 and its FLOPs by a factor of 83. The CBAM and FPEM make the model pay more attention to building targets, improving its accuracy. Compared with the other models, the proposed model achieves the best results in mask quality evaluation and semantic segmentation metrics, and shows the strongest generalization ability on the mining-area building test set. |
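To make concrete how Focal loss can rebalance building and background pixels, the following is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), for per-pixel building probabilities. The function names and the values of `alpha` and `gamma` are illustrative assumptions; the abstract does not state the hyperparameters or the exact variant used in the study.

```python
import math


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal loss for a single pixel.

    p: predicted probability that the pixel is a building.
    y: ground-truth label, 1 for building and 0 for background.
    alpha, gamma: illustrative values (assumptions, not from the source).
    """
    p = min(max(p, eps), 1.0 - eps)             # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # per-class weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified pixels.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, alpha=0.25, gamma=2.0):
    """Average focal loss over a flat list of pixels."""
    losses = [binary_focal_loss(p, y, alpha, gamma) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    easy_background = binary_focal_loss(0.05, 0)  # confidently correct background pixel
    hard_building = binary_focal_loss(0.30, 1)    # poorly predicted building pixel
    print(easy_background, hard_building)
```

In this sketch an easily classified background pixel contributes far less loss than a misclassified building pixel, which is the mechanism that keeps the abundant background class from dominating the gradient. Setting `gamma=0` and `alpha=0.5` reduces the expression to half of the ordinary binary cross-entropy.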