Research On Key Technologies Of Campus Navigation Based On Deep Visual Odometry

Posted on: 2022-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: C Qin
GTID: 2518306572458524
Subject: Design
Abstract/Summary:
Simultaneous Localization and Mapping (SLAM) is an important branch of computer vision and underpins applications such as automatic path planning, robot localization, and navigation for self-driving cars. Visual odometry is a core component of a SLAM system and has attracted growing attention from researchers because cameras are inexpensive and easy to carry.

Traditional methods first scan the surrounding environment with expensive equipment, then run complex processing steps such as feature extraction and stereo matching, and only perform camera localization and navigation once the complete scene has been reconstructed. This workflow is inconvenient for a campus scene for two reasons. First, most of the campus is outdoors, so the scanning results are easily affected by weather. Second, the campus scene is dynamic, so it is hard to determine the content of the current scene from the results of previous scans. In contrast, Deep Visual Odometry (DVO) based on deep learning relies only on image information and is less restricted by external environmental factors. It estimates the position and orientation of the camera, together with the depth of each monocular image, while the camera is moving, and uses these estimates for localization and navigation. DVO is a leading direction in current SLAM research.

This thesis studies the key technologies of campus navigation and optimizes an existing deep-learning-based visual odometry module, which greatly reduces the running time and improves the accuracy of the estimated camera pose and scene depth. The main contributions are as follows.

First, an end-to-end deep visual odometry method for a monocular camera is proposed that simultaneously estimates image depth and camera pose. It uses the photometric error as a constraint and jointly optimizes the depth estimation network and the pose estimation network in an unsupervised manner.

Second, the photometric error assumes gray-scale (brightness) constancy, which is often violated in real scenes. An Image Alignment (IA) module is therefore designed to handle gray-scale changes caused by exposure differences between images and to determine the global motion scale.

Third, a super-resolution network replaces simple interpolation for upsampling in the depth estimation module, in order to fill the holes caused by inaccurate estimation in large texture-less areas. Combined with the global motion scale, this alleviates, to a certain extent, the scale inconsistency of monocular cameras.

Fourth, a regularization term estimated by an uncertainty estimation network (U-CNN) is introduced to reduce the influence of moving objects and occluded areas, which makes the estimation more robust.

Fifth, the depth map predicted from the horizontally flipped image is merged with the depth map predicted from the original image, which yields a more robust depth estimate.

The proposed method is named AUDVO (Aligned U-CNN Deep VO). Evaluation on the public KITTI dataset demonstrates the effectiveness of AUDVO for robust single-view depth estimation and visual odometry: it produces accurate depth estimates at object edges and reaches the same level of performance as traditional monocular VO.
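
To make two of the ideas above concrete, the following is a minimal sketch in plain Python, not the thesis's implementation. It assumes that the exposure change between two frames can be approximated by a global affine brightness model (target ≈ a · source + b) fitted in closed form by least squares, whereas the IA module in the thesis may work quite differently. It also assumes a mean absolute (L1) photometric error and a simple average for merging the flipped and normal depth maps. All function names (`fit_brightness`, `photometric_error`, `merge_flipped_depth`) are hypothetical.

```python
def fit_brightness(src, tgt):
    """Least-squares fit of tgt ~ a * src + b over flat lists of intensities.

    Assumption: a global affine model stands in for the exposure change.
    """
    n = len(src)
    mean_s = sum(src) / n
    mean_t = sum(tgt) / n
    var_s = sum((s - mean_s) ** 2 for s in src)
    if var_s == 0:
        # Constant source image: only an offset can be estimated.
        return 1.0, mean_t - mean_s
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(src, tgt))
    a = cov / var_s
    b = mean_t - a * mean_s
    return a, b


def photometric_error(src, tgt, align=True):
    """Mean absolute intensity difference, optionally after brightness alignment."""
    a, b = fit_brightness(src, tgt) if align else (1.0, 0.0)
    return sum(abs(a * s + b - t) for s, t in zip(src, tgt)) / len(src)


def merge_flipped_depth(depth, depth_from_flipped):
    """Average a depth map with the prediction made on the mirrored image.

    Both inputs are lists of rows; the second is flipped back horizontally
    before averaging. Plain averaging is an assumption of this sketch.
    """
    merged = []
    for row, row_f in zip(depth, depth_from_flipped):
        unflipped = row_f[::-1]
        merged.append([(d + f) / 2 for d, f in zip(row, unflipped)])
    return merged


# Toy data: the target is the warped source under a brighter exposure.
warped = [10, 20, 30, 40]
target = [17, 29, 41, 53]  # equals 1.2 * warped + 5

raw_error = photometric_error(warped, target, align=False)
aligned_error = photometric_error(warped, target, align=True)
merged = merge_flipped_depth([[1.0, 2.0, 3.0]], [[5.0, 4.0, 3.0]])
```

In this toy case the unaligned error is large even though the two images show identical content, while the aligned error is close to zero. This illustrates why an uncorrected exposure change would mislead a photometric training signal.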
Keywords/Search Tags: Deep Learning, Visual Odometry, Uncertainty Estimation Network, Super Resolution Network