Design And Implementation Of Visual Odometry System Based On End-to-end Learning

Posted on: 2022-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: H Qin
Full Text: PDF
GTID: 2558307154475914
Subject: Information and Communication Engineering
Abstract/Summary:
As an important part of a visual simultaneous localization and mapping (SLAM) system, visual odometry (VO) estimates the real-time position of smart mobile devices such as robots or cars from the image sequences captured by visual sensors (cameras). As the related computer vision technologies have matured, VO systems have developed as well and are now widely used in virtual reality, autonomous driving, intelligent robots, the industrial Internet and other fields, which has drawn wide attention from researchers. After a thorough investigation of current learning-based VO methods, this thesis completes the following work.

First, a VO algorithm framework based on context-gated convolution and a subspace attention mechanism is proposed. The key parts of the algorithm are: (1) to extract more representative image feature maps, context-gated convolution is used in place of the conventional convolutional neural network, so that the scale of the convolution kernels can be modified dynamically using global information from the entire image; (2) a lightweight subspace attention module is designed that divides the image feature map into multiple subspaces along the channel dimension to further enrich the image feature space, computes an attention map for each subspace, and assigns weights according to importance so that the feature maps retain more key information. The proposed VO system is lightweight and shows good experimental results on the public KITTI and EuRoC datasets and on self-collected datasets. Compared with a VO method based solely on a convolutional neural network, the proposed system reduces the average translation error on the KITTI dataset by 45.6% and the average rotation error by 35.2%.

Second, the research finds that the performance of a VO system based on end-to-end learning is mainly limited by the network framework. The algorithm is therefore further improved at the level of the convolutional neural network framework, and a VO method based on a multi-branch convolutional neural network and a self-attention mechanism is proposed. The key parts of the algorithm are: (1) by expanding the dimensions of the original feature extraction network, a multi-branch structure is built to capture key image feature maps across channels; (2) a self-attention layer is designed to capture the key features of, and the internal correlation between, adjacent images, which reduces the dependence on external information during feature extraction and enhances the ability of the VO system to learn from long sample sequences; (3) an adaptive activation function based on FReLU is designed, so that while the end-to-end VO model is trained, the activation state of the neurons adjusts to the system input and the VO system remains robust under different scene conditions. Extensive experiments show that the accuracy of the proposed VO system is significantly improved, and that its single-frame pose estimation time on the KITTI dataset is much shorter than that of current mainstream VO methods.
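
The subspace attention idea in the first contribution (split the feature map along the channel dimension, compute one attention map per subspace, reweight the features) can be illustrated with a minimal pure-Python sketch. This is not the thesis's module, whose layers are not given in the abstract. The function name `subspace_attention`, the per-pixel channel-mean scoring, the softmax over spatial positions, and the residual term are all assumptions chosen only to keep the example small.

```python
import math


def subspace_attention(feature_map, num_subspaces):
    """Illustrative subspace attention over a C x H x W nested-list feature map.

    Assumption: the attention score at each pixel is the mean of the
    subspace's channels; a learned scoring layer would normally go here.
    """
    channels = len(feature_map)
    if channels % num_subspaces != 0:
        raise ValueError("channel count must be divisible by num_subspaces")
    group_size = channels // num_subspaces
    height, width = len(feature_map[0]), len(feature_map[0][0])

    output = []
    for g in range(num_subspaces):
        # Slice out the channels that belong to this subspace.
        group = feature_map[g * group_size:(g + 1) * group_size]

        # Hypothetical scoring: average the group's channels at each pixel.
        scores = [
            sum(ch[i][j] for ch in group) / group_size
            for i in range(height)
            for j in range(width)
        ]

        # Softmax over all spatial positions: one attention map per subspace.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        attention = [e / total for e in exps]

        # Reweight each channel and keep the input via a residual term
        # (assumed here): out = x * attention + x.
        for ch in group:
            output.append([
                [ch[i][j] * (1.0 + attention[i * width + j]) for j in range(width)]
                for i in range(height)
            ])
    return output
```

Calling `subspace_attention(fm, 2)` on a four-channel map returns a map of the same shape in which each two-channel subspace has been reweighted by its own attention map, so pixels that score highly within a subspace are amplified relative to the rest.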
Keywords/Search Tags: Visual Odometry, End-to-End Learning, Context-Gated Convolution, Attention Mechanism