With the development of artificial intelligence, the requirements for sensors in commercial applications are also increasing. For visual sensors, there are two main kinds of application. The first is visual positioning: positioning technologies represented by visual odometry and visual SLAM are widely used in autonomous driving, robotics, and industrial automation. The second is three-dimensional information acquisition: vision-based dense reconstruction is among the core algorithms in robot scene understanding, face reconstruction, 3D printing, and smart cities. The combination of the two is structure from motion, an important branch of 3D reconstruction that has received great attention from academia and industry in recent years.

In this paper, traditional pose estimation and multi-view stereo vision are studied in depth, and existing algorithms are improved by applying the multi-view geometry principles of computer vision. First, building on the existing stereo calibration algorithm, a multi-camera calibration algorithm based on a minimum spanning tree is implemented. To reduce the dependence on the number of feature points and address the problem of insufficient point features in pose estimation, a binocular vision pose estimation algorithm based on affine correspondences is proposed, and a visual odometry framework is implemented on top of it. To make effective use of multi-view information and resolve the ambiguity of binocular stereo matching, a multi-view stereo matching algorithm based on confidence maps is proposed. Finally, a multi-camera calibration program and an affine-correspondence-based structure-from-motion program are implemented. For multi-camera systems, the initial pose estimation is performed using the minimum spanning tree, and a non-linear optimization is then applied to refine all camera parameters. The structure-from-motion algorithm based on affine correspondences makes effective use of neighborhood information, improves pose accuracy, and avoids the trajectory drift that traditional feature-point methods suffer in regions with insufficient features. The research contents of this paper are mainly divided into multi-camera calibration, visual odometry based on affine correspondences, and multi-view stereo matching.

(1) Multi-camera calibration based on a minimum spanning tree and stereo calibration optimization. Stereo calibration is used to estimate the pose relationship between any two cameras and to construct a pose graph. The minimum spanning tree algorithm is then applied to obtain a minimally connected subgraph, so that each camera has a unique pose with respect to the reference camera. A non-linear optimization refines the parameters of all cameras in 3D space by minimizing the reprojection error, so that the camera parameters can be obtained quickly and accurately, laying the foundation for the subsequent stereo visual odometry and multi-view stereo matching algorithms.
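As a rough illustration of this minimum-spanning-tree initialization, the sketch below builds a pose graph from pairwise stereo calibrations, extracts a spanning tree with Prim's algorithm, and chains the relative poses back to a reference camera. The edge weight (assumed here to be the stereo-calibration reprojection RMSE), the pose-composition convention, and the function names are illustrative assumptions rather than the exact implementation of this thesis; the joint non-linear refinement that minimizes reprojection error over all cameras would follow this initialization.

```python
# Sketch: MST initialization for multi-camera calibration (assumptions noted above).
import numpy as np

def mst_prim(num_cams, edges):
    """edges: dict {(i, j): (weight, T_ij)} where T_ij is the 4x4 transform
    mapping points from camera j to camera i, obtained by stereo calibration.
    Returns the spanning-tree edges, grown from camera 0 (the reference)."""
    in_tree, chosen = {0}, []
    while len(in_tree) < num_cams:
        best = None
        for (i, j), (w, T) in edges.items():
            if (i in in_tree) ^ (j in in_tree):      # edge leaving the current tree
                if best is None or w < best[0]:
                    best = (w, i, j, T)
        if best is None:
            raise ValueError("pose graph is not connected")
        _, i, j, T = best
        in_tree.update((i, j))
        chosen.append((i, j, T))
    return chosen

def initial_poses(num_cams, edges):
    """Chain relative poses along the spanning tree to obtain an initial pose
    T_0k (camera k -> reference camera 0) for every camera."""
    poses = {0: np.eye(4)}
    tree = mst_prim(num_cams, edges)
    while len(poses) < num_cams:
        for i, j, T_ij in tree:
            if i in poses and j not in poses:
                poses[j] = poses[i] @ T_ij               # T_0j = T_0i * T_ij
            elif j in poses and i not in poses:
                poses[i] = poses[j] @ np.linalg.inv(T_ij)
    return poses
```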
(2) Visual odometry based on affine correspondences and feature selection. A novel algorithm is proposed to estimate the absolute camera pose from two affine correspondences (ACs). By exploring the relationship between the affine transformation and the projection equation, six linear constraints are derived, so only two ACs are sufficient to recover the pose. Although perspective cameras are assumed, the constraints generalize straightforwardly to other camera models, since they describe the relationship between local affinities and projection. Because fewer correspondences are required, the proposed algorithm needs fewer sampling iterations when robust estimators such as RANSAC are applied, and it remains stable even with a rather limited number of correspondences. To improve robustness, the affine transformations are further refined via photometric and epipolar constraints. The proposed method was validated on both synthetic and real-world datasets, and it yields results superior to the state of the art in terms of accuracy. When implementing the visual odometry framework, this paper also proposes a cyclic feature matching method based on non-maximum suppression: features are selected with non-maximum suppression during extraction, and chained feature matching is performed across multiple frames. This improves matching accuracy and reduces the pose error caused by redundant feature points and ambiguous matches; bundle adjustment is finally applied to refine the poses, improving the accuracy of pose estimation in a short time.

(3) Multi-view stereo matching based on confidence maps and projection consistency. Binocular stereo matching usually suffers from matching ambiguity, and a large number of holes remain after the consistency check. This paper uses the binocular semi-global matching algorithm to compute a confidence map for the disparity map. With the pose relationships between adjacent frames obtained from visual odometry or multi-camera calibration, reprojection is used to compute matching costs at multiple time instants, and these costs are fused according to the confidence map to compute the disparity map. To address the large number of redundant spatial points and the noise that arise when fusing multi-view depth maps, a point cloud filtering algorithm based on projection consistency is proposed, which effectively reduces invalid data in the fused point cloud.
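To make the projection-consistency idea concrete, the following sketch keeps a fused 3D point only if its reprojected depth agrees with the depth maps of enough views. The pinhole projection, the relative depth tolerance `rel_tol`, the `min_consistent` vote threshold, and all function names are hypothetical choices for illustration, not the exact criteria used in the thesis.

```python
# Sketch: point cloud filtering by projection consistency (illustrative thresholds).
import numpy as np

def project(K, T_wc, X_w):
    """Map world points X_w (N, 3) into a camera with intrinsics K (3, 3) and
    world-to-camera transform T_wc (4, 4); returns homogeneous pixel
    coordinates (N, 3) and camera-frame depths (N,)."""
    X_h = np.hstack([X_w, np.ones((len(X_w), 1))])
    X_c = (T_wc @ X_h.T).T[:, :3]                  # points in the camera frame
    return (K @ X_c.T).T, X_c[:, 2]

def consistency_filter(points, views, min_consistent=3, rel_tol=0.01):
    """views: list of (K, T_wc, depth_map) tuples. A point survives if its
    projected depth matches the stored depth in at least `min_consistent`
    views, within the relative tolerance `rel_tol`."""
    votes = np.zeros(len(points), dtype=int)
    for K, T_wc, depth in views:
        h, w = depth.shape
        uvw, z = project(K, T_wc, points)
        agree = np.zeros(len(points), dtype=bool)
        idx = np.where(z > 1e-6)[0]                # only points in front of the camera
        u = np.round(uvw[idx, 0] / z[idx]).astype(int)
        v = np.round(uvw[idx, 1] / z[idx]).astype(int)
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        idx, u, v = idx[inside], u[inside], v[inside]
        agree[idx] = np.abs(depth[v, u] - z[idx]) < rel_tol * z[idx]
        votes += agree
    return points[votes >= min_consistent]
```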