In recent years, with the steady acceleration of urbanization and the rapid development of artificial intelligence, autonomous driving in urban road scenes has received widespread attention from industry. Given the safety requirements of autonomous driving and the complexity of urban road scenes, real-time and accurate environmental perception has become a research focus. Semantic segmentation, as the basis of environmental perception, is therefore of great significance to the field. However, existing high-accuracy semantic segmentation networks require substantial hardware resources, which hinders practical deployment, while existing lightweight segmentation networks sacrifice considerable segmentation accuracy. It is therefore of both research and practical value to design a real-time semantic segmentation network that balances inference speed and segmentation accuracy for efficient perception of urban road scenes.

Because urban road scenes contain many objects of diverse sizes, a segmentation network for these scenes must extract sufficient multi-scale context information and spatial information. Although existing two-branch real-time semantic segmentation networks use a context branch and a spatial branch to extract context information and spatial information independently, they still suffer from a redundant shallow feature extraction structure and poor interaction between the two branches, and thus fail to balance inference speed and segmentation accuracy effectively. This paper therefore optimizes the dual-branch architecture and designs a dual-branch interactive network for real-time semantic segmentation, with three contributions: first, a fast downsampling module is proposed to build a shared backbone, reducing the structural redundancy of shallow feature
extraction; second, a multi-layer crossing connection residual unit is proposed to build the context branch, and multi-level cascaded pyramid pooling modules are added to fully extract multi-scale context information; finally, a correlation-guided interaction module is proposed to promote interaction between the context branch and the spatial branch, helping the spatial branch resolve spatial information accurately. This interactive two-branch structure effectively addresses the segmentation of multi-scale objects and balances inference speed and segmentation accuracy.

In addition, objects in urban road scenes have diverse shapes, and the spatial relationships between them are extremely complex. As a result, certain areas of a scene are difficult to segment accurately, such as areas containing strip-shaped objects and connected or occluded objects. To address these difficult areas, this paper builds on the dual-branch interactive network and designs a shape perception and edge assistance network for real-time semantic segmentation. Specifically, a strip context-aware module is introduced into the context branch to improve the segmentation of strip-shaped objects. At the same time, an edge auxiliary supervision path is constructed from multi-scale feature maps, edge extraction is formulated as an independent binary pixel classification task, and the edge information is used as a constraint to enhance the fusion of context information and semantic information. This approach further improves network performance in difficult areas at the cost of a slight reduction in inference speed.
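
To make the binary edge classification task concrete, the following minimal sketch shows one way a per-pixel edge target could be derived from a semantic label map. It is an illustration under stated assumptions, not the procedure used in this work: the function name `edge_targets` and the 4-neighbour rule (a pixel counts as an edge if any adjacent pixel carries a different class) are hypothetical choices, since the abstract does not specify how edge labels are produced.

```python
from typing import List


def edge_targets(labels: List[List[int]]) -> List[List[int]]:
    """Build a binary edge map from a semantic label map.

    Assumption (not from the source): a pixel is marked 1 if any of its
    4-connected neighbours has a different class label, otherwise 0.
    """
    h = len(labels)
    w = len(labels[0]) if h else 0
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Compare the pixel with its up, down, left and right neighbours.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and labels[ny][nx] != labels[y][x]:
                    edges[y][x] = 1
                    break
    return edges


if __name__ == "__main__":
    # Two classes split by a vertical boundary between columns 1 and 2.
    label_map = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
    ]
    for row in edge_targets(label_map):
        print(row)
```

A map of this kind could serve as the target for a two-class (edge versus non-edge) per-pixel loss on the auxiliary path, trained alongside the main segmentation loss.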