
Dense Depth And Semantic Estimation Under Road Environments

Posted on: 2021-03-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: N Zou
Full Text: PDF
GTID: 1362330614467747
Subject: Information and Communication Engineering
Abstract/Summary:
With the surge in artificial intelligence research, autonomous driving, one of the important indicators of progress in the field, has become a research hotspot in recent years. An autonomous driving system usually consists of three core components: perception, planning, and control. Scene understanding, the most important task in perception, refers to the accurate and efficient analysis of the environment around the vehicle through computer vision and other methods. With the continuing development of deep learning in recent years, scene understanding methods based on deep neural networks have gradually come to outperform traditional methods. However, many problems in the design of deep neural networks for scene understanding remain worth exploring. For example: Is there a more efficient convolution method for low-level feature extraction in a specific subtask? How can data from heterogeneous sensors, such as on-board lidar and color images, be combined effectively to design a more effective deep neural network structure?
How can the inherent relationships between the different subtasks of scene understanding be exploited to build a joint, unified, and better-performing solution?

In general, full scene understanding in autonomous driving can be divided into two aspects: geometric and semantic. Geometric estimation is mainly represented by depth estimation of the scene. To meet the high-precision and high-resolution requirements of today's autonomous driving, both geometric and semantic estimation must reach a dense, pixel-level output. This dissertation therefore focuses on two closely related scene understanding subtasks: dense depth estimation and semantic segmentation. Building on deep learning, and taking the basic convolutional template and convolution method in the neural network as both the focal point and the bridge between the two subtasks, it addresses three aspects. In multi-source data utilization, heterogeneous lidar and color image data are combined to construct an effective network structure. In the convolution operation, a boundary-aware convolution method is proposed to effectively obtain similarity characteristics from pixels with the same semantic label. In multi-task joint optimization, the correlation between different data sources and different target tasks is explored to improve the accuracy of the method.

The main work and contributions of this dissertation are as follows:

1. A depth completion method guided by semantic prior information is proposed. To address depth completion from sparse lidar data in the road environment, the method makes full use of the semantic prior information available in image space, targeting the ground and obstacles on the basis of the observed depth distribution. It proposes different asymmetric multi-scale convolution structures adapted to the depth distribution of different semantic regions along the vertical and horizontal directions. A cascaded network structure is designed: the semantic segmentation sub-network takes combined image and lidar input, and a depth completion network capable of fusing the semantic prior is proposed to improve the accuracy of depth estimation.

2. A boundary-aware semantic segmentation method is proposed. To address semantic segmentation in the road environment, the method designs a dedicated convolution for different semantic regions: a boundary-aware convolution structure that fuses feature information by assigning different weights to different neurons in the receptive field. The overall network architecture combines multi-scale boundary information with a fully convolutional neural network and builds an improved semantic segmentation model on the proposed boundary-aware convolution. Experimental results show that the proposed method improves the accuracy of semantic segmentation.

3. A joint solution for semantic segmentation and depth completion based on boundary guidance is proposed. Unlike the asymmetric multi-scale convolution structure proposed earlier, the joint multi-task network model is built by linking the two subtasks through boundary information, with the boundary serving as weak supervision for both depth completion and semantic segmentation. A single-encoder, multi-branch-decoder network architecture is designed that performs boundary detection, semantic segmentation, and depth completion simultaneously. Joint loss functions across the tasks are constructed to optimize overall network performance. Experimental results show that the multi-task framework improves accuracy compared with any single task.
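As a rough illustration of the idea behind the first contribution, the sketch below fills missing depth with a window whose orientation depends on the semantic class. It is not the dissertation's network: it is a hand-written averaging filter, and the function name `fill_depth`, the two class constants, and the choice of a horizontal window for ground and a vertical window for obstacles are all assumptions made for this example. The geometric motivation is that, on a flat road seen from a forward-facing camera, depth changes little along an image row, while on an upright obstacle it changes little along a column.

```python
GROUND, OBSTACLE = 0, 1  # hypothetical class ids for this sketch


def fill_depth(sparse_depth, semantics, half_width=2):
    """Fill pixels without a lidar return (depth == 0) from nearby returns.

    Ground pixels average along their row (1 x k window); obstacle pixels
    average along their column (k x 1 window). Only measured pixels of the
    same semantic class contribute.
    """
    height, width = len(sparse_depth), len(sparse_depth[0])
    dense = [row[:] for row in sparse_depth]
    offsets = range(-half_width, half_width + 1)
    for r in range(height):
        for c in range(width):
            if sparse_depth[r][c] > 0:
                continue  # keep the measured lidar depth unchanged
            if semantics[r][c] == GROUND:
                window = [(r, c + k) for k in offsets]  # horizontal window
            else:
                window = [(r + k, c) for k in offsets]  # vertical window
            samples = [
                sparse_depth[y][x]
                for y, x in window
                if 0 <= y < height and 0 <= x < width
                and sparse_depth[y][x] > 0
                and semantics[y][x] == semantics[r][c]
            ]
            if samples:
                dense[r][c] = sum(samples) / len(samples)
    return dense


# Rows 0-1 are an obstacle, row 2 is ground; 0.0 marks "no lidar return".
semantics = [[1, 1, 1],
             [1, 1, 1],
             [0, 0, 0]]
sparse = [[8.0, 0.0, 0.0],
          [0.0, 0.0, 12.0],
          [5.0, 0.0, 0.0]]
dense = fill_depth(sparse, semantics)
```

In this toy input the obstacle columns are completed from returns above or below them, the ground row is completed from the return to its left, and the middle obstacle column stays empty because it has no same-class return in its window.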
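The weighting idea in the second contribution can be pictured with a minimal, non-learned sketch. The abstract does not give the actual weighting scheme, so the hard 0/1 weights derived from a label map below are an assumption for illustration only, as is the function name `boundary_aware_average`. The point it shows is that when neighbours across a semantic boundary receive zero weight, features are aggregated only from pixels that share the centre pixel's label.

```python
def boundary_aware_average(features, labels, radius=1):
    """Average each pixel's neighbourhood using per-neighbour weights.

    A neighbour gets weight 1 if it has the same label as the centre pixel
    and weight 0 otherwise, so aggregation never crosses a label boundary.
    """
    height, width = len(features), len(features[0])
    output = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            total, weight_sum = 0.0, 0.0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    r, c = row + dr, col + dc
                    if not (0 <= r < height and 0 <= c < width):
                        continue  # skip positions outside the feature map
                    weight = 1.0 if labels[r][c] == labels[row][col] else 0.0
                    total += weight * features[r][c]
                    weight_sum += weight
            # The centre pixel always matches itself, so weight_sum >= 1.
            output[row][col] = total / weight_sum
    return output


features = [[1.0, 1.0, 9.0, 9.0],
            [1.0, 1.0, 9.0, 9.0]]
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]
smoothed = boundary_aware_average(features, labels)
```

Here `smoothed` equals `features`: the sharp step between the two regions survives, whereas a plain 3 x 3 average would blur the values on either side of the boundary.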
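A joint objective over the three tasks of the third contribution is commonly written as a weighted sum of per-task losses. The sketch below assumes that form; the abstract does not state the dissertation's actual loss terms or weights, so the specific losses (binary cross-entropy, masked L1), the weights, and all function names are illustrative assumptions. Segmentation is reduced to two classes to keep the example short.

```python
import math


def binary_cross_entropy(predictions, targets, eps=1e-7):
    """Mean binary cross-entropy for probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(predictions, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(targets)


def masked_l1(predicted_depth, lidar_depth):
    """Mean absolute depth error over pixels that have a lidar return."""
    errors = [abs(p - d) for p, d in zip(predicted_depth, lidar_depth) if d > 0]
    return sum(errors) / len(errors) if errors else 0.0


def joint_loss(boundary_loss, segmentation_loss, depth_loss,
               weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses (weights are hypothetical)."""
    w_boundary, w_segmentation, w_depth = weights
    return (w_boundary * boundary_loss
            + w_segmentation * segmentation_loss
            + w_depth * depth_loss)


# Toy per-pixel outputs from the three decoder branches.
boundary_term = binary_cross_entropy([0.5, 0.5], [1, 0])
segmentation_term = binary_cross_entropy([0.9, 0.2], [1, 0])
depth_term = masked_l1([10.0, 20.0, 30.0], [12.0, 0.0, 29.0])  # 0.0 = no return
total = joint_loss(boundary_term, segmentation_term, depth_term,
                   weights=(1.0, 1.0, 0.5))
```

Because all three terms are computed from the outputs of one shared encoder, minimizing `total` lets gradients from each task shape the shared features, which is the usual motivation for a single-encoder, multi-branch design.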
Keywords/Search Tags: Road Environment, Semantic Segmentation, Depth Estimation, Boundary Detection, Deep Neural Network, Convolutional Structure Design, Multi-task Learning