Simultaneous Localization and Mapping (SLAM) is indispensable for mobile robots to achieve accurate localization in unknown environments. When the sensor used by the SLAM system is a camera, the system is called a visual SLAM system. Most current visual SLAM systems assume that the external environment is static, but this assumption does not hold in practical application scenarios. To address the problem that traditional visual SLAM systems are easily affected by moving objects in dynamic environments, which degrades localization accuracy, this thesis integrates deep-learning-based object detection into the visual SLAM system, which effectively improves the localization accuracy of visual SLAM in dynamic environments and provides semantic information about the environment. Since conventional deep learning networks place high demands on hardware resources, this thesis constructs different lightweight object detection networks for different hardware platforms, so that the deep-learning-based visual SLAM system can run in real time on platforms with low computing power, including embedded platforms. The specific research work is as follows:

(1) To address the problem that current deep-learning-based object detection networks have excessive hardware requirements and are difficult to run in real time, this thesis constructs lightweight object detection networks for the CPU and GPU platforms respectively. For the CPU platform, after comparing common lightweight networks, MobileNetV3 is adopted as the backbone of the YOLOv5s object detection network, yielding the lightweight detection network MobileNetV3-YOLOv5s, which reduces the number of model parameters by 51.88% and increases detection speed on the CPU by 53.06%. For the GPU platform, in order to fully exploit the advantages of GPU parallel computing, this thesis uses TensorRT to fuse and accelerate the YOLOv5 series of networks. Experimental results show that, compared with the original YOLOv5s network, the fused TensorRT-YOLOv5s network reduces GPU memory usage by 39.84% while improving inference speed by a factor of 12. Finally, the detection performance of YOLOv5s, MobileNetV3-YOLOv5s, and TensorRT-YOLOv5s is compared in a real campus road environment.

(2) To address the low localization accuracy and poor robustness of current visual SLAM systems in dynamic environments, this thesis integrates the MobileNetV3-YOLOv5s lightweight object detection network and a pyramidal optical flow method into the visual SLAM system, building on the work in (1). The SLAM front end can then effectively eliminate dynamic feature points while extracting ORB feature points, so that only feature points on static objects are used for inter-frame matching and camera pose estimation, which improves the localization accuracy of the system in dynamic environments. The proposed algorithm is evaluated on the TUM dynamic datasets, where its pose estimation accuracy is improved by 80.16% compared with ORB-SLAM3. Compared with other SLAM algorithms designed for dynamic environments, both real-time performance and accuracy are improved.

In conclusion, the research results of this thesis effectively reduce the impact of dynamic objects on the localization accuracy of visual SLAM systems, and the lightweight object detection network in the system can obtain semantic information in real time. This work offers a new way to integrate deep learning into the research and application of visual SLAM systems, and provides a useful reference for deploying visual SLAM on embedded platforms.
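To make the front-end idea in (2) concrete, the sketch below shows one possible way to combine detection results with pyramidal Lucas-Kanade optical flow to keep only static ORB feature points. It is a minimal illustration using OpenCV, not the thesis's actual ORB-SLAM3 C++ implementation; the class list `DYNAMIC_CLASSES`, the detection tuple format, and the threshold `flow_thresh` are assumptions made here for clarity.

```python
import cv2
import numpy as np

# Hypothetical set of object classes treated as potentially dynamic.
DYNAMIC_CLASSES = {"person", "car", "bicycle"}

def filter_dynamic_orb_points(prev_gray, curr_gray, detections, flow_thresh=2.0):
    """Keep ORB keypoints that lie outside dynamic-object boxes and whose
    pyramidal LK optical flow agrees with the dominant inter-frame motion.

    detections: list of (class_name, (x1, y1, x2, y2)) from a detector such as
    MobileNetV3-YOLOv5s (format assumed for illustration only).
    """
    orb = cv2.ORB_create(nfeatures=1000)
    keypoints = orb.detect(curr_gray, None)

    # 1) Discard keypoints that fall inside any dynamic-object bounding box.
    boxes = [box for cls, box in detections if cls in DYNAMIC_CLASSES]

    def in_dynamic_box(pt):
        x, y = pt
        return any(x1 <= x <= x2 and y1 <= y <= y2 for x1, y1, x2, y2 in boxes)

    candidates = [kp for kp in keypoints if not in_dynamic_box(kp.pt)]
    if not candidates:
        return []

    # 2) Track the remaining points with pyramidal LK optical flow and drop
    #    points whose flow deviates strongly from the median (residual motion).
    pts = np.float32([kp.pt for kp in candidates]).reshape(-1, 1, 2)
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(curr_gray, prev_gray, pts, None)
    good = status.ravel() == 1
    if not good.any():
        return []
    flow = (next_pts - pts).reshape(-1, 2)
    median_flow = np.median(flow[good], axis=0)
    residual = np.linalg.norm(flow - median_flow, axis=1)

    # Only these static keypoints would be passed on for inter-frame matching.
    return [kp for kp, ok, r in zip(candidates, good, residual)
            if ok and r < flow_thresh]
```

In the actual system the same filtering is performed inside the SLAM front end before feature matching, so that camera pose estimation never sees points on moving objects.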