In recent years, with the continuous development of artificial intelligence technology, intelligent mobile robots have played an increasingly important role in human production and life, replacing humans in tasks such as patrolling, surveying, transportation, and rescue. Simultaneous Localization and Mapping (SLAM) is one of the key technologies in this field: the robot localizes itself and incrementally builds a map of an unknown environment from the data streams of its own sensors. Most SLAM methods assume that the environment is static. This strong assumption limits the applicability of most visual SLAM systems, because dynamic objects cause many wrong data associations during the SLAM process and prevent the system from localizing and mapping accurately.

To solve this problem, this paper proposes a visual SLAM system for dynamic scenes built on ORB-SLAM2. (1) The Grid-based Motion Statistics (GMS) algorithm is used, which exploits the motion-coherence constraint between neighbouring pixels and encapsulates it as a statistical likelihood of match support between region pairs. Since the statistics of true and false matches follow different distributions, incorrect matches on dynamic objects are removed by thresholding this support score. (2) A sliding window model is added to the initialization process to build a static initial 3D map that does not contain feature points belonging to dynamic objects. Because of the camera's high frame rate, the motion of dynamic objects between adjacent frames is small and cannot be removed effectively by GMS alone, so during initialization features are matched between the first frame and the n-th frame, where n is the size of the sliding window. (3) For the RGB-D case, a keyframe-based static dense mapping thread is added. In this thread, the YOLO v3 object detection network and a dynamic feature extraction algorithm are used to accurately remove the dynamic regions in the keyframes, and the RGB and depth information of the remaining static regions is used to build a static map online.

Comparative experiments are conducted on the public TUM and KITTI datasets, which contain monocular, stereo, and RGB-D sensor data. We test on indoor low-dynamic scenes, indoor high-dynamic scenes, and large outdoor scenes, mainly using two quantitative metrics, absolute trajectory error (ATE) and relative pose error (RPE), to evaluate the localization results in terms of absolute trajectory accuracy and drift. In addition, a static point cloud reconstruction experiment is carried out for indoor high-dynamic scenes. The experimental results indicate that the proposed SLAM algorithm for dynamic scenes effectively eliminates the influence of dynamic objects: the absolute trajectory accuracy in indoor high-dynamic scenes improves by 96.06% compared with the traditional method, and the system can generate static point cloud maps without dynamic objects online, which have high value for information reuse.
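
To illustrate contribution (1), the following is a minimal sketch of GMS-based mismatch filtering between a pair of frames. It assumes opencv-contrib-python is installed (which provides cv2.xfeatures2d.matchGMS); the file names, feature budget, and threshold factor are placeholder values, not the exact configuration of the proposed system.

```python
# Minimal GMS filtering sketch; file names and parameters are placeholders.
import cv2

img1 = cv2.imread("frame_prev.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_curr.png", cv2.IMREAD_GRAYSCALE)

# ORB features, as in ORB-SLAM2; a large keypoint budget helps the GMS statistics.
orb = cv2.ORB_create(nfeatures=3000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force Hamming matching produces the raw (noisy) putative matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
raw_matches = matcher.match(des1, des2)

# GMS scores each match by counting supporting matches in neighbouring grid
# cells (the motion-smoothness statistic) and keeps those above the threshold,
# which discards mismatches concentrated on moving objects.
gms_matches = cv2.xfeatures2d.matchGMS(
    img1.shape[:2][::-1], img2.shape[:2][::-1], kp1, kp2, raw_matches,
    withRotation=False, withScale=False, thresholdFactor=6.0)

print(f"raw matches: {len(raw_matches)}, after GMS: {len(gms_matches)}")
```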
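
The sliding-window initialization of contribution (2) can be summarized as below. This is a sketch only: match_and_filter and try_initialize_map are hypothetical helpers standing in for the ORB+GMS matching step (as above) and for ORB-SLAM2's two-view map initialization, and the window size n and match threshold are illustrative values.

```python
# Sketch of the sliding-window initialization; match_and_filter and
# try_initialize_map are hypothetical placeholders for the GMS matching and
# two-view triangulation steps described in the text.
def initialize_with_window(frames, n=5, min_matches=100):
    first = frames[0]
    for k in range(n, len(frames)):
        # Matching the first frame against the k-th frame widens the temporal
        # baseline, so dynamic objects move far enough for GMS to reject their
        # feature matches before the initial map is triangulated.
        matches = match_and_filter(first, frames[k])
        if len(matches) >= min_matches:
            ok, initial_map = try_initialize_map(first, frames[k], matches)
            if ok:
                return initial_map   # static initial 3D map
    return None
```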
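
For contribution (3), the following sketch shows how a static point cloud can be assembled from a single RGB-D keyframe once the dynamic regions have been detected (for example, "person" bounding boxes from YOLO v3). The pinhole intrinsics and depth scale are example TUM-style values rather than the calibration used in the experiments, and fusing the per-keyframe clouds with the estimated camera poses is omitted.

```python
# Build a static point cloud from one RGB-D keyframe by masking out detected
# dynamic regions; intrinsics and depth scale are placeholder example values.
import numpy as np

fx, fy, cx, cy = 525.0, 525.0, 319.5, 239.5   # example pinhole intrinsics
depth_scale = 5000.0                          # TUM RGB-D convention: 1/5000 m

def static_point_cloud(rgb, depth, dynamic_boxes):
    """rgb: HxWx3 uint8, depth: HxW uint16, dynamic_boxes: list of (x1, y1, x2, y2)."""
    mask = depth > 0                           # keep only pixels with valid depth
    for x1, y1, x2, y2 in dynamic_boxes:       # drop detected dynamic regions
        mask[y1:y2, x1:x2] = False

    v, u = np.nonzero(mask)
    z = depth[v, u].astype(np.float64) / depth_scale
    x = (u - cx) * z / fx                      # back-project with the pinhole model
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=1)
    colors = rgb[v, u]
    return points, colors                      # fuse across keyframes using their poses
```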
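
Finally, the ATE metric used in the evaluation can be sketched as follows. This is a simplified version that rigidly aligns the associated estimated and ground-truth positions (Kabsch/Umeyama alignment without scale) and reports the RMSE of the translational differences, in the spirit of the TUM benchmark tooling; trajectory association by timestamp is assumed to have been done already.

```python
# Simplified ATE RMSE: rigid alignment followed by the RMSE of per-pose
# translational differences between estimated and ground-truth positions.
import numpy as np

def ate_rmse(est, gt):
    """est, gt: Nx3 arrays of timestamp-associated camera positions."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    H = (est - mu_e).T @ (gt - mu_g)
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T                          # optimal rotation (no scale)
    t = mu_g - R @ mu_e
    aligned = est @ R.T + t                     # estimated trajectory after alignment
    return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))
```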