Research On Optimization Of End-to-end Binocular Stereo Matching Algorithm Based On Convolutional Neural Network

Posted on: 2023-04-06 | Degree: Master | Type: Thesis
Country: China | Candidate: H F Tang | Full Text: PDF
GTID: 2568306791956909 | Subject: Information and Communication Engineering
Abstract/Summary:
The core task of binocular stereo vision is to compute the pixel disparity between corresponding matching points in a pair of binocular images in order to obtain depth information. Because it offers simple configuration and high precision, it has broad application prospects in autonomous driving, virtual reality, and intelligent robotics. End-to-end binocular stereo matching algorithms based on convolutional neural networks have become a research hot spot in binocular stereo vision. To address the long running time of high-precision algorithms and the poor accuracy of real-time algorithms, this thesis optimizes the accuracy and real-time performance of the end-to-end binocular stereo matching algorithm based on convolutional neural networks.

For scenes with high precision requirements, an end-to-end stereo matching network with dense feature fusion is proposed. First, a feature pyramid network is built from multiple residual modules to capture multi-scale context information with fewer parameters. Then, a dense fusion module expands the receptive field and fuses information from different scales to obtain dense feature maps, reducing the mismatch rate that feature sparseness causes in complex regions. On this basis, a hybrid attention module is constructed to enhance the network's ability to process useful information. The performance evaluation shows that the network achieves a mismatch rate of 2.23% on the KITTI 2015 dataset with a running time of 0.22 s. Compared with the high-precision algorithm PSMNet, the mismatch rate is lower and the running time is reduced by 46%, which effectively solves the problem of long running time in high-precision algorithms.

For scenarios with high real-time requirements, a progressively refined end-to-end real-time stereo matching network is proposed. First, a lightweight feature pyramid network is designed to extract and fuse context information and generate feature outputs at three scales. Then, the hybrid attention module enhances information interaction across the spatial and channel dimensions and improves the generalization ability of the network. Finally, a three-stage disparity map is output in a progressive refinement manner to further reduce the number of parameters. Ablation experiments show that the network has a mismatch rate of 4.08% and a running time of 0.02 s on the KITTI 2015 dataset. Compared with the real-time algorithm AnyNet, the mismatch rate is reduced by 2.43% and the running time increases by only 0.3 ms, which effectively solves the problem of poor accuracy in real-time algorithms.

For scenarios that place requirements on both accuracy and speed, an end-to-end stereo matching network with dual-channel feature fusion is proposed. First, a dual-channel feature extraction network extracts shallow spatial information and deep semantic information, and a fast downsampling strategy keeps the network lightweight. Next, multi-scale feature fusion produces outputs at three scales that contain rich spatial and semantic information, which improves matching in both the edge regions and the interior regions of objects. Finally, a three-stage disparity map is output by progressive refinement. Ablation experiments show that the network has a mismatch rate of 2.32% on the KITTI 2015 dataset and that the matching speed of all three stages exceeds 20 FPS. Compared with the progressively refined end-to-end real-time stereo matching network, the accuracy is greatly improved and matching in the interior and edge regions of objects is better, which effectively addresses the difficult trade-off between high precision and high speed.
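
The abstract's opening statement, that disparity is computed in order to obtain depth, rests on the standard relation for a rectified stereo pair: depth = focal length × baseline / disparity. The following is a minimal sketch of that relation only, not code from the thesis. The function name and the camera values in the example (720 px focal length, 0.5 m baseline) are illustrative assumptions.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres of a point seen by a rectified stereo pair.

    disparity_px: horizontal pixel offset between the left and right views.
    focal_px:     focal length expressed in pixels.
    baseline_m:   distance between the two camera centres in metres.
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Hypothetical camera: 720 px focal length, 0.5 m baseline.
near = depth_from_disparity(36.0, focal_px=720.0, baseline_m=0.5)
far = depth_from_disparity(9.0, focal_px=720.0, baseline_m=0.5)
```

Because depth is inversely proportional to disparity, a disparity error of fixed size costs more depth accuracy for distant points than for near ones, which is one reason the mismatch rate matters in driving scenes.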
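
The mismatch rates quoted above (2.23%, 4.08%, 2.32%) can be read with a concrete metric in mind. The abstract does not define the metric, so the sketch below assumes the KITTI 2015 benchmark's outlier convention: a pixel counts as mismatched when its disparity error exceeds both 3 px and 5% of the ground-truth disparity. The function name and the convention that a non-positive ground-truth value marks a pixel without ground truth are assumptions made for illustration.

```python
def mismatch_rate(pred, truth, abs_thresh=3.0, rel_thresh=0.05):
    """Fraction of valid pixels whose predicted disparity is a mismatch.

    pred, truth: flat sequences of disparities in pixels, same length.
    A ground-truth value <= 0 is treated as "no ground truth" and skipped
    (an assumption of this sketch).
    """
    bad = 0
    valid = 0
    for p, t in zip(pred, truth):
        if t <= 0:
            continue  # pixel has no ground truth
        valid += 1
        err = abs(p - t)
        # Outlier only if the error is large in absolute AND relative terms.
        if err > abs_thresh and err > rel_thresh * t:
            bad += 1
    return bad / valid if valid else 0.0


# Toy data: four pixels, the last one without ground truth.
pred = [10.0, 14.0, 100.0, 50.0]
truth = [10.5, 10.0, 104.0, 0.0]
rate = mismatch_rate(pred, truth)
```

In the toy data only the second pixel is an outlier: the third has a 4 px error, but that is below 5% of its 104 px ground truth, so one of three valid pixels is mismatched.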
Keywords/Search Tags:binocular stereo vision, stereo matching, convolutional neural network, multi-scale contextual information, attention mechanism