| In binocular vision, cameras serve as the image acquisition equipment, and a stereo matching method recovers 3D depth information from the input images. Because of its low cost and easy operation, binocular vision has been widely used in autonomous driving and other fields. Its core is stereo matching, which computes the disparity pixel by pixel between corresponding matching points. Convolutional neural networks have achieved strong results on binocular stereo matching, yet large matching errors remain in ill-posed regions such as reflective regions and small objects in real scenes. To improve disparity accuracy, we studied PSM-Net and analyzed its feature extraction module and disparity calculation module. The details are as follows. First, to improve the feature extraction ability of the network on the basis of PSM-Net, the attention mechanism and the residual network are used to construct the feature extraction module, and a stereo matching algorithm based on the attention mechanism is proposed. The feature extraction module uses the attention mechanism to model the information contained in the channels and the relationships between channels. Different network layers extract different features: shallow layers focus more on image edge and structure information, while deep layers focus more on abstract semantic features. The proposed feature extraction module therefore uses skip connections to reduce the loss of shallow features during transmission, which helps improve the matching accuracy while ensuring the adequacy of feature extraction. Second, end-to-end stereo matching algorithms based on deep learning need to introduce a disparity dimension when constructing the matching cost, which requires 3D convolution and sharply increases the number of network parameters. To address this problem, a stereo matching algorithm based on separable convolution is proposed. This algorithm builds on the proposed attention-based stereo matching algorithm. In the disparity calculation stage, depthwise separable 3D convolution replaces ordinary 3D convolution for processing the matching cost. In addition, considering that convolution kernels of different scales have different receptive fields, this paper also designs a context information extraction module that extracts multi-scale information to improve the matching accuracy of the network and to ensure that small objects in the scene are not filtered out after convolution. Finally, the proposed model is quantitatively analyzed on the Scene Flow and KITTI Stereo datasets, which are commonly used in this research field, and the visualization results are analyzed. The results show that the proposed algorithm has a low matching error, especially in ill-posed regions such as reflective regions and small objects. |
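
The parameter saving behind the separable-convolution design can be illustrated with a short count. The sketch below is only an illustration, not the paper's implementation: the function names are hypothetical, and the 32-channel width and 3x3x3 kernel are assumed values chosen for the example. It compares the weights of one ordinary 3D convolution with those of a depthwise 3D convolution followed by a 1x1x1 pointwise convolution.

```python
def conv3d_params(c_in, c_out, k):
    """Weights in an ordinary 3D convolution with a k x k x k kernel (no bias)."""
    return c_in * c_out * k ** 3


def separable_conv3d_params(c_in, c_out, k):
    """Weights in a depthwise separable 3D convolution (no bias).

    Depthwise step: one k x k x k filter per input channel.
    Pointwise step: a 1 x 1 x 1 convolution that mixes channels.
    """
    depthwise = c_in * k ** 3
    pointwise = c_in * c_out
    return depthwise + pointwise


# Assumed layer shape for illustration: 32 -> 32 channels, 3x3x3 kernel.
standard = conv3d_params(32, 32, 3)
separable = separable_conv3d_params(32, 32, 3)
ratio = separable / standard

print(standard, separable, round(ratio, 4))
```

Under these assumed values the ordinary layer has 27648 weights and the separable one has 1888, a ratio of about 0.068, which equals 1/c_out + 1/k^3. The actual saving in the proposed network depends on its real layer widths and kernel sizes, which the abstract does not state.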