
Research On 3D Object Detection In Intelligent Construction Based On Computer Vision

Posted on: 2024-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y G Wang
Full Text: PDF
GTID: 2542306917970469
Subject: Software engineering
Abstract/Summary:
Object detection plays a key role in computer vision. It identifies the location and category of objects in images, and is widely used in areas such as autonomous driving, smart transportation, intelligent construction, and smart manufacturing. However, object detection in intelligent construction is currently limited to 2D detection, while applications such as unmanned construction, collision detection, and construction worker safety monitoring require the three-dimensional position of objects. In addition, existing computer vision tasks each provide only a local visual understanding, and their detection results are not managed in a unified way. As a result, intelligent construction lacks both a global visual understanding of the site and the applications that depend on it, and data resources are wasted. To address these problems, this thesis analyzes the needs of intelligent construction in construction projects and proposes a method for constructing an outdoor 3D dynamic scene graph for construction sites, combining computer vision tasks such as scene graph generation, object detection, image segmentation, and 3D reconstruction. It then focuses on monocular 3D object detection algorithms and applies them to intelligent construction. The specific research contents are as follows:

(1) An outdoor 3D dynamic scene graph for intelligent construction and its construction method. To address the lack of 3D scene understanding models for outdoor construction projects, this thesis follows the 3D scene graph paradigm and, using network models for different task categories and abstraction levels, builds a unified representation of actionable spatial perception for outdoor intelligent construction: the Outdoor 3D Dynamic Scene Graph (ODSG). ODSG divides the outdoor construction site into unified representations at different levels and granularities, and models the spatial relationships among them. It manages construction site data in a unified manner while adapting to different types of construction tasks. The validity of the model is verified through simulation and experimental results.

(2) Monocular 3D object detection based on a multi-scale pyramid feature fusion network. Large differences in object scale are an important challenge for monocular 3D object detection, and the usual remedy is multi-scale feature fusion. However, most current multi-scale feature fusion methods use a single-scale up-and-down sampling pyramid network to fuse features of different scales. They ignore the influence of noise in the feature maps and the non-smoothness of feature fusion, and do not make full use of the information in the multi-scale feature maps. To address these problems, this thesis designs a multi-scale pyramid feature fusion network composed of an improved DLA34 and a multi-scale pyramid network. It achieves smooth and full fusion of multi-scale features while reducing the impact of noise on subsequent feature fusion. Experimental results on the open-source KITTI dataset show that, compared with other methods based on multi-scale fusion, the proposed method achieves the best average precision for 3D detection and BEV detection at the easy, moderate, and hard difficulty levels.

(3) DETR-based 3D object detection fusing depth and saliency information. Most existing monocular 3D object detection algorithms combine geometric relationships with convolutional neural networks to predict the 3D attributes of objects, and lack depth feature information and global relationships among features. To address these problems, this thesis designs a DETR (DEtection TRansformer) monocular 3D object detection algorithm that combines depth and saliency information. A lightweight unsupervised depth module is constructed to extract object depth features, and a Transformer model is introduced to capture the global relationships among features. In addition, to reduce the high computational cost of the Transformer model, a saliency network is designed to lower the computational load of the Transformer encoder. Experimental results on the official KITTI dataset show that, compared with other current advanced detection algorithms, the proposed algorithm achieves the best results on multiple detection accuracy metrics, and ablation experiments demonstrate the effectiveness of each module.

In summary, the simulation results on the construction scene dataset and the experimental results on the open-source KITTI dataset demonstrate the effectiveness of the proposed ODSG model and of the two monocular 3D object detection methods.
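To make the idea of a layered scene graph with modeled spatial relationships more concrete, the following is a minimal Python sketch. It is not the ODSG implementation from the thesis: the layer names ("site", "zone", "object"), the relation labels ("contains", "near"), the distance threshold, and all class and function names are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    layer: str        # hypothetical layer name: "site", "zone" or "object"
    category: str     # e.g. "worker", "excavator"
    position: tuple   # (x, y, z) in metres, in an assumed site coordinate frame


@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add_node(self, node, parent_id=None):
        """Register a node and, optionally, link it to a node one level up."""
        self.nodes[node.node_id] = node
        if parent_id is not None:
            self.edges.append((parent_id, "contains", node.node_id))

    def add_proximity_edges(self, layer, max_distance):
        """Add a 'near' edge between same-layer nodes within max_distance."""
        members = [n for n in self.nodes.values() if n.layer == layer]
        for i, a in enumerate(members):
            for b in members[i + 1:]:
                if math.dist(a.position, b.position) <= max_distance:
                    self.edges.append((a.node_id, "near", b.node_id))

    def neighbours(self, node_id, relation):
        """Return ids linked to node_id by the given relation, in either direction."""
        found = []
        for src, rel, dst in self.edges:
            if rel != relation:
                continue
            if src == node_id:
                found.append(dst)
            elif dst == node_id:
                found.append(src)
        return found


def build_demo_graph():
    """Build a tiny illustrative graph; all values are made up."""
    g = SceneGraph()
    g.add_node(Node("site_1", "site", "construction_site", (0.0, 0.0, 0.0)))
    g.add_node(Node("zone_1", "zone", "excavation_area", (0.0, 0.0, 0.0)), "site_1")
    g.add_node(Node("worker_1", "object", "worker", (0.0, 0.0, 0.0)), "zone_1")
    g.add_node(Node("excavator_1", "object", "excavator", (3.0, 4.0, 0.0)), "zone_1")
    g.add_node(Node("crane_1", "object", "crane", (50.0, 0.0, 0.0)), "zone_1")
    g.add_proximity_edges("object", max_distance=5.0)
    return g
```

In this sketch, a safety check such as "which machines are close to this worker" becomes a lookup like `build_demo_graph().neighbours("worker_1", "near")`, which illustrates why keeping 3D detections in one graph supports a global rather than per-image view of the site.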
Keywords/Search Tags:Computer vision, Intelligent construction, Monocular 3D object detection, Transformer, Multi-scale feature