
Research On 3D Reconstruction Of Outdoor Scene

Posted on: 2024-03-03  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Kang
GTID: 2558306917970489  Subject: Software engineering
The carriers of human information have changed tremendously, from murals to images to virtual reality. Three-dimensional information about objects provides richer visual perception, so it is in great demand and has broad application prospects. 3D reconstruction algorithms in computer vision can efficiently recover three-dimensional models of scenes and objects with low cost, easy data acquisition, and short processing time, and have therefore attracted extensive research. Traditional reconstruction methods recover 3D models from the geometric relationships between feature points in the scene, which gives them an advantage in reconstruction accuracy. In outdoor scenes, however, visual changes are more pronounced, and feature points are hard to detect repeatably, so the extracted points carry large errors. In addition, humans are the most important component of outdoor dynamic scenes, and reconstructing them is crucial, but their non-rigid motion breaks the underlying geometric relationships and makes reconstruction difficult. We therefore aim to solve the problems of 3D reconstruction in outdoor dynamic scenes by combining traditional and learning-based algorithms. First, for static outdoor scenes, we address the inaccurate reconstruction caused by feature point errors by incorporating a learning-based feature extraction network. Second, for outdoor dynamic scenes, we treat the slow movement of objects as rigid motion and use the geometric consistency between frames to estimate depth for video sequences. Finally, for severe non-rigid motion, we combine optical flow information with a warping function to reconstruct dynamic human bodies in 3D, and ultimately reconstruct the entire outdoor scene. The specific work is summarized as follows:

(1) 3D reconstruction for outdoor static scenes. In static scenes, traditional methods recover the scene model under strong geometric constraints derived from the geometric relationships between feature points. To extract feature points that can be detected reliably and repeatably, a deep feature extraction network is used to generate descriptors for the sparse feature points of an image, which are then matched against the dense deep features of the images to be matched. The initial matches are then refined: keypoint positions are adjusted by minimizing the feature-metric error. The same deep feature-metric error is then used in bundle adjustment to optimize the camera poses, and finally a dense reconstruction algorithm generates a more accurate three-dimensional model of the scene.

(2) 3D reconstruction for outdoor dynamic scenes. First, a multi-view stereo method estimates a single-frame depth map of the scene. Then, a segmentation network removes dynamic objects, and the depth map of the static scene is estimated from motion parallax. The consistency between the depth maps produced by the two methods is used to obtain a relatively accurate depth map, which serves as supervision for the network. At the same time, the method of the previous chapter is used for sparse reconstruction of the static scene to obtain more accurate camera poses. The position error between the points predicted by the dense optical flow field and the depth-reprojected points is used as an inter-frame geometric consistency term that encourages the network to generate more consistent depth maps.

(3) 3D reconstruction for outdoor dynamic non-rigid bodies. To achieve consistent 3D reconstruction of non-rigid human bodies, a warping function first transforms the depth map to another time frame, while the depth estimation network estimates the depth map of that time frame. The consistency between the estimated depth map and the warped depth map is used as the loss function to train the network. Meanwhile, an optical flow network estimates the motion of the object, and the distance between the warped depth-map points and the optical-flow correspondences is used as the loss function to fine-tune the network, producing a depth estimation network specific to the given video. The depth map is then converted into a 3D model. During depth estimation, surface normals and depth are estimated by a normal encoder-decoder network and a depth encoder-decoder network, respectively. The geometric relationship between depth and surface normals then provides more accurate surface normals with which to supervise both the normal network and the depth network.

The proposed method is evaluated on various datasets, including outdoor scene images acquired by the author, and compared against state-of-the-art 3D reconstruction algorithms. Both quantitative and qualitative experiments demonstrate the effectiveness of the proposed method, which combines deep learning techniques with geometric constraints to address the challenges of outdoor scene reconstruction and to promote the application of 3D reconstruction in real-world scenarios.
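
The inter-frame geometric consistency term described in (2) can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the function names `reproject` and `flow_reprojection_error`, the pinhole intrinsics `K`, and the pose convention X_j = R X_i + t are all assumptions made here for illustration. The sketch back-projects every pixel of frame i with its depth, moves it into frame j, projects it, and measures the pixel distance to the position predicted by the optical flow.

```python
import numpy as np


def pixel_grid(h, w):
    """Return (u, v) pixel coordinate arrays of shape (h, w)."""
    return np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))


def reproject(depth, K, R, t):
    """Project frame-i pixels into frame j using depth and relative pose.

    Assumed convention (hypothetical, not from the thesis):
    X_j = R @ X_i + t, with K a 3x3 pinhole intrinsic matrix.
    """
    h, w = depth.shape
    u, v = pixel_grid(h, w)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)  # homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                   # K^-1 * pixel
    pts_i = rays * depth[..., None]                   # 3D points, camera i
    pts_j = pts_i @ R.T + t                           # 3D points, camera j
    proj = pts_j @ K.T
    return proj[..., :2] / proj[..., 2:3]             # pixel coords in j


def flow_reprojection_error(depth, flow, K, R, t):
    """Per-pixel distance between flow target and depth-reprojected point."""
    h, w = depth.shape
    u, v = pixel_grid(h, w)
    flow_target = np.stack([u, v], axis=-1) + flow    # where flow sends each pixel
    return np.linalg.norm(reproject(depth, K, R, t) - flow_target, axis=-1)
```

For a fronto-parallel plane at depth d and a sideways camera translation t_x, every pixel shifts horizontally by f_x * t_x / d, so a flow field with exactly that shift gives zero error; averaging this error over static pixels would give one possible scalar loss.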
Keywords/Search Tags: 3D Reconstruction, Structure From Motion, Dynamic Scene Reconstruction, Non-Rigid Human Reconstruction