| Visual Simultaneous Localization and Mapping (VSLAM) is a system that allows intelligent mobile robots to obtain location information through their own sensors and achieve precise positioning in unfamiliar environments. However, traditional VSLAM rests on a strong rigidity assumption: the external environment is assumed to be static. As a result, dynamic objects in the environment degrade the positioning accuracy and robustness of the system, and a clear and usable environmental map cannot be obtained, which seriously interferes with downstream applications such as human-computer interaction. To improve the positioning accuracy and mapping ability of robots in indoor dynamic scenes, this paper integrates a lightweight target detection network into the VSLAM system. The specific research contents are as follows: (1) Because YOLOv5, the current mainstream target detection algorithm, still struggles to run in real time on a CPU platform, this paper uses MobileNetV3, ShuffleNetV2, and PP-LCNet modules to make lightweight improvements to the backbone network of YOLOv5 and trains the resulting models on COCO2017 and a self-made dataset. The model that achieves the best balance of accuracy and speed is selected through comparative experiments. The final results show that the improved YOLOv5-PP, based on the PP-LCNet module, performs best: detection speed increases by 53.84% while mAP@0.5 decreases by only 2.91%. (2) To address the poor positioning accuracy of traditional SLAM in dynamic environments, this paper proposes a dynamic and static frame mechanism and a dynamic region detection algorithm that incorporates improved multi-view geometry. A new target detection thread is added, and indoor objects are divided into three categories: dynamic, static, and potentially moving objects. YOLOv5s-PP is used to perform target detection on the input RGB image and establish semantic labels for the different categories. A dynamic point culling
method based on grid area division is proposed, and the improved multi-view geometry takes 86 ms/frame. The strategy that combines deep learning with a purely geometric algorithm is verified by experiments on the TUM dataset. Compared with ORB-SLAM3, the pose accuracy of the proposed algorithm improves by 84.74% on highly dynamic sequences, and it also shows some improvement over other dynamic SLAM algorithms. (3) To address the blurring and ghosting that dynamic objects cause in traditional point cloud mapping, this paper uses dynamic-object detection boxes and geometric vision strategies to remove the corresponding image information during the mapping stage. The point clouds of the system's keyframes are stitched together according to their corresponding poses, and dense point cloud maps with static backgrounds are constructed for the TUM highly dynamic sequences, a simple real-world scene, and a complex real-world scene, respectively. The comparison shows that this system effectively improves positioning accuracy and the readability of the constructed maps in dynamic environments. |
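
The semantic part of the culling step in (2) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `cull_dynamic_points`, the class sets `DYNAMIC` and `POTENTIALLY_MOVING`, and the `moving_flags` input (standing in for the result of a multi-view geometry check) are all hypothetical, and the grid area division is not modeled. The assumed rule is that feature points inside a dynamic-object box are always discarded, while points inside a potentially-moving-object box are discarded only when the geometric check flags them.

```python
import numpy as np

# Hypothetical class grouping: the abstract names three categories
# (dynamic, static, potentially moving) but not their members.
DYNAMIC = {"person"}
POTENTIALLY_MOVING = {"chair", "book"}


def cull_dynamic_points(keypoints, detections, moving_flags=None):
    """Return a boolean mask marking the keypoints to keep.

    keypoints:    (N, 2) array-like of (x, y) pixel coordinates.
    detections:   list of (label, (x1, y1, x2, y2)) bounding boxes.
    moving_flags: optional (N,) booleans from a geometric consistency
                  check (assumed input); True means the point looks
                  inconsistent with a static scene.
    """
    pts = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    keep = np.ones(len(pts), dtype=bool)
    if moving_flags is None:
        flags = np.zeros(len(pts), dtype=bool)
    else:
        flags = np.asarray(moving_flags, dtype=bool)

    for label, (x1, y1, x2, y2) in detections:
        inside = (
            (pts[:, 0] >= x1) & (pts[:, 0] <= x2)
            & (pts[:, 1] >= y1) & (pts[:, 1] <= y2)
        )
        if label in DYNAMIC:
            # Always drop points on objects assumed to be moving.
            keep &= ~inside
        elif label in POTENTIALLY_MOVING:
            # Drop only the points that the geometric check flagged.
            keep &= ~(inside & flags)
        # Boxes of static classes leave the mask unchanged.
    return keep
```

For example, with keypoints at (10, 10), (50, 50), and (90, 90), a `person` box covering the first point, and a `chair` box covering the second, the first point is always removed, and the second is removed only if `moving_flags` marks it as inconsistent.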