| 3D reconstruction is the process of recovering the 3D information of objects from images of them using computer vision technology. It is widely used in cultural relic reconstruction, medical image processing, autonomous driving, large-scene reconstruction, and other fields. Traditional multi-view stereo (MVS) reconstruction algorithms achieve high accuracy, but in scenes with drastic lighting changes or little texture, their depth estimates are incomplete and reconstruction is costly. As image acquisition equipment has become cheaper, deep learning algorithms, which require large training datasets, have been widely applied to 3D reconstruction in recent years. However, the prediction accuracy and efficiency of most MVS networks depend heavily on the depth hypotheses, and it is difficult to balance computation against accuracy in scenes with a large depth range. How to improve the accuracy and completeness of depth estimation without increasing the amount of computation remains a problem worth exploring. This paper therefore focuses on deep-learning-based multi-view stereo 3D reconstruction and carries out the following research: (1) A 3D reconstruction algorithm based on a multi-head self-attention mechanism is proposed. Combining the baseline network with multi-head self-attention helps the network recover the 3D structure information of the images effectively and completely. During multi-scale feature extraction, parameters are shared, global information is fused, and higher-dimensional information is captured, which improves the completeness of the reconstruction. To mitigate the problems introduced by multi-layer convolution and multi-head self-attention, such as more parameters, more computation, and slower running speed, a more 
lightweight model is obtained by reducing the number of feature-extraction layers, which lowers video memory usage and shortens running time. Experimental results show that the algorithm's completeness error on the DTU dataset is 0.271 mm, 0.006 mm lower than that of the baseline network. The method thus improves reconstruction completeness, while the reduced number of feature-extraction layers shortens reconstruction time. Full reconstructions on a large-scene dataset further show that the method is robust and generalizes well. (2) A 3D reconstruction algorithm combining sparse image initialization and loss optimization is proposed. It addresses two weaknesses: random initialization of the depth map and reliance on a single loss function. Reconstruction accuracy is improved by initializing the depth map with a traditional method, and a combined loss function is used to reduce noise and smooth uneven edges. Combined with the multi-scale feature extraction network based on multi-head self-attention, these improvements raise the accuracy and completeness of the reconstructed model and yield a lighter network structure, faster training, and lower video memory usage. Experiments show that the accuracy error of the point cloud reconstructed by the improved algorithm on the DTU dataset is 0.384 mm, a 10.1% improvement over the baseline network, with less noise and smoother edges. (3) A 3D reconstruction application system based on MVS is developed. Motivated by the protection and cultural inheritance of bronzes, a bronze 3D reconstruction system is designed and implemented. It is simple and convenient to operate, so that non-professionals can easily reconstruct bronzes in 3D. |
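
The accuracy and completeness errors quoted above are usually understood as mean nearest-neighbour distances between the reconstructed point cloud and the ground-truth point cloud. The following is a minimal sketch of that idea, assuming a brute-force search and leaving out the masking and outlier filtering that a full DTU evaluation applies. The function names and the toy points are illustrative assumptions, not taken from the paper.

```python
import math


def mean_nearest_distance(source, target):
    """Average, over points in `source`, of the distance to the closest point in `target`."""
    total = 0.0
    for p in source:
        total += min(math.dist(p, q) for q in target)
    return total / len(source)


def accuracy_and_completeness(reconstructed, ground_truth):
    """Return (accuracy error, completeness error) for two point clouds.

    Accuracy: how far reconstructed points lie from the ground truth.
    Completeness: how far ground-truth points lie from the reconstruction.
    """
    accuracy = mean_nearest_distance(reconstructed, ground_truth)
    completeness = mean_nearest_distance(ground_truth, reconstructed)
    return accuracy, completeness


# Toy point clouds in millimetres (made-up values, not DTU data).
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reconstructed = [(0.0, 0.3, 0.0), (1.0, 0.0, 0.4)]

acc, comp = accuracy_and_completeness(reconstructed, ground_truth)
print(f"accuracy error: {acc:.3f} mm, completeness error: {comp:.3f} mm")
```

In this toy case the reconstruction misses the third ground-truth point, so the completeness error comes out larger than the accuracy error. This is why the two metrics are reported separately: a sparse but precise point cloud can score well on accuracy and poorly on completeness.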