Research On Semantic Segmentation Of Complex Scenes

Posted on: 2023-04-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zhou
Full Text: PDF
GTID: 2568306917979149
Subject: Control theory and control engineering
Abstract/Summary:
With the rapid development of unmanned autonomous systems and artificial intelligence, semantic scene understanding has become an active research topic. Unmanned autonomous devices such as unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs) are widely used in precision agriculture, urban planning, electric power inspection, and rapid rescue. Semantic segmentation is an important means by which such devices perceive scene information. This thesis constructs semantic segmentation models for three kinds of complex scenes, namely urban, suburban, and road scenes, to accomplish the task of semantic understanding of complex scenes. Semantic segmentation research in China and abroad has achieved strong results, but the following problems remain in practical applications: 1) urban scenes contain many small-scale objects, which existing networks segment poorly; 2) when a UAV observes suburban scenes, changes in the UAV's pose cause large changes in the scale of ground objects, which reduces segmentation accuracy; 3) in real-time semantic segmentation of road scenes for unmanned vehicles, it is difficult to balance segmentation accuracy against real-time performance. After analyzing and summarizing the shortcomings of existing semantic segmentation algorithms, this thesis carries out research in the following three areas.

(1) To achieve high-precision semantic segmentation of urban scenes, this thesis constructs a semantic segmentation algorithm based on the Self-Guidance Attention module. To address the poor segmentation of small-scale objects, the algorithm adds the Self-Guidance Attention module to the original DeepLabv3+ model. The module refines feature information in both the spatial and channel dimensions and then fuses the results with a pooling fusion module, improving the utilization of image feature information. Semantic segmentation experiments on urban scenes are conducted on three datasets, CityScapes, UDD, and BDD100K, to verify the accuracy and generalization ability of the improved model. Comparison experiments against other networks and ablation experiments on the SGAD module are also conducted to verify the advantages of the improved network and the improvement that the SGAD module brings to the original network.

(2) To achieve high-precision semantic segmentation of suburban scenes from aerial platforms, this thesis constructs a semantic segmentation algorithm based on bidirectional multi-scale attention. To address the excessive changes in object scale caused by UAV position changes, the algorithm first uses bilinear interpolation sampling to generate multi-scale images, then fuses the multi-scale image features bidirectionally, and adopts ResNet101 as the backbone feature extraction network, improving the adaptability and effectiveness of feature extraction. Semantic segmentation experiments on UAV suburban scenes are conducted on three datasets, AeroScapes, SSbenchmark, and GID, to verify the segmentation performance of the bidirectional multi-scale network structure from several experimental perspectives. Comparison experiments against other networks and ablation experiments on the backbone feature extraction network are also conducted to verify the advantages of the bidirectional multi-scale network structure and the higher segmentation accuracy obtained with ResNet101 as the backbone.

(3) To achieve real-time semantic segmentation of road scenes, this thesis constructs a semantic segmentation algorithm with a lightweight encoder-decoder. To address the large parameter counts and slow segmentation speed of existing segmentation networks, the algorithm uses an asymmetric encoder-decoder network structure to reduce the number of network parameters and improve segmentation efficiency. The encoder uses the Split-Shuffle-non-bottleneck module to improve the feature extraction ability of the backbone network, and the decoder uses the attention pyramid module to improve the accuracy of the segmentation network. Prediction experiments on road scenes are conducted on three datasets, CamVid, KITTI, and ApolloScape; on all three, the mIoU exceeds 72% and the average segmentation speed exceeds 55 FPS, verifying the accuracy and real-time performance of the algorithm. Comparison experiments against other lightweight segmentation networks and ablation experiments on the APN module verify the efficiency of the lightweight segmentation algorithm.
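
The two evaluation figures quoted above, mIoU and FPS, can be illustrated with a minimal Python sketch. This is not the thesis's evaluation code: it assumes the standard definitions (per-class IoU = TP / (TP + FP + FN), averaged over the classes that occur; FPS = frames processed per second), and the function names `confusion_matrix`, `mean_iou`, and `frames_per_second` are hypothetical names chosen for this example.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the ground-truth class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Mean of per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the labels nor the predictions are skipped
    (an assumption; benchmarks differ in how they treat absent classes).
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # true class c, predicted otherwise
        fp = sum(matrix[r][c] for r in range(n)) - tp   # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


def frames_per_second(num_frames: int, elapsed_seconds: float) -> float:
    """Average segmentation speed over a timed run."""
    return num_frames / elapsed_seconds


# Toy example: four pixels, two classes (flattened label maps).
labels = [0, 0, 1, 1]
predictions = [0, 1, 1, 1]
cm = confusion_matrix(labels, predictions, num_classes=2)
print(mean_iou(cm))                 # class 0 IoU = 1/2, class 1 IoU = 2/3
print(frames_per_second(110, 2.0))  # 110 frames in 2 seconds
```

In practice the label maps would be flattened per-pixel class indices accumulated over a whole test set, and the timing would cover only the network's forward passes.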
Keywords/Search Tags: complex scene, semantic segmentation, attention module, multi-scale, lightweight