| With the development of technology, the demand for intelligent systems is increasing. Simultaneous Localization and Mapping (SLAM), which mainly addresses self-positioning and environmental perception, has been widely used in fields such as autonomous driving, augmented reality, and robot navigation. However, most current visual SLAM systems perform localization and map construction at the pixel or feature-point level. Real scenes contain a large amount of object information in addition to these geometric features. Integrating this object information into visual SLAM helps a machine understand the spatial structure of a scene and perceive the content of its surroundings, which is an important development direction for visual SLAM. This thesis designs and implements a Visual-Inertial Monocular 3D Object SLAM system. By combining a 3D object detection algorithm with a visual-inertial fusion algorithm, the system builds on a visual SLAM system to achieve large-scale localization and to construct a sparse point cloud map with object labels. Extensive experiments analyze the algorithm of each module and verify the feasibility and effectiveness of the system. The main contributions of this thesis are as follows: 1. In the front-end module of the visual SLAM system, 3D object detection algorithms are incorporated to improve the system’s scene understanding. In addition, a visual-inertial odometry based on 3D object detection is designed and implemented by fusing visual-inertial data with 3D object detection, which improves the robustness of the whole system and the accuracy of 3D object detection.
2. In the back-end module of the visual SLAM system, a joint optimization algorithm over visual-inertial and 3D object information is designed and implemented according to the characteristics of the system. It improves the robustness and accuracy of the system, further deepens scene understanding, and enables the construction of point cloud maps with object labels as well as large-scale localization. 3. Based on the visual-inertial data and object information in the system, a data association strategy between frames is designed and implemented, which reduces the interference of dynamic objects and object occlusion in complex scenes. |