| Many human activities depend on the ability to perceive one's surroundings, and vision is one of the main channels of human perception. With the development of robotics, there is an increasing need for robots to sense their environment through sensors and to interact with it autonomously. Simultaneous Localization and Mapping (SLAM) technology was developed to solve this problem. The core idea of SLAM is to let a robot collect data while it moves, analyse its surroundings, infer its own trajectory, and build a model of the external environment relative to its position, producing a map that can be reused later. The key problem in SLAM is pose solving, which forms the front end of the whole SLAM system and is responsible for computing the robot's continuous pose change from consecutive sensor frames during motion. This thesis investigates the pose-solving problem, and its main research content consists of two parts. (1) First, the RANSAC-based homography matrix pose-solving method commonly used in visual SLAM is studied. Theoretical analysis and experimental comparison on the dataset show that, when scene depth varies, its feature point selection is inaccurate, which leads to information loss and poorly distributed feature points. Based on this analysis, a pose-solving method based on multi-planar segmentation is proposed. Unlike the RANSAC-based method, which passively selects feature points lying in the same plane, the multi-planar segmentation method actively separates planes at different depths through planar instance segmentation and then performs pose estimation on each plane separately. Experiments verify the effectiveness of the multi-planar segmentation method in terms of feature point utilization, the distribution range of valid inliers, and overall error reduction. (2) Second, after plane segmentation the valid plane pairs often have small areas and little texture, so too few feature points are detected and matching is poor. To address this, a deep-learning-based homography estimation network is designed that directly estimates the homography matrix between consecutive frames, avoiding the dependence on feature points. The main problem to be solved in the network design is that, after multi-plane segmentation, the valid content regions of the images vary widely in size and appear at largely random positions. Therefore, compared with common homography network models, the network in this thesis introduces an explicit attention mechanism: it adds a segmentation prediction branch that extracts the position of the valid region in the image and, through a shared structure, helps the homography estimation branch improve its estimate. This design also addresses the large oscillations during training and the poor final prediction accuracy that common network models exhibit on data generated after multi-planar segmentation. Comparative experiments demonstrate the superior performance of the proposed network structure, and ablation experiments confirm the effectiveness of the proposed method. |
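
The idea of estimating a homography separately for each segmented plane can be illustrated with a minimal sketch. It is not the implementation used in this thesis: it assumes that matched point pairs and a per-point plane label are already available (in practice the labels would come from planar instance segmentation), it uses a plain normalized Direct Linear Transform (DLT) with no RANSAC outlier rejection, and the function names `estimate_homography_dlt`, `apply_homography`, and `homographies_per_plane` are hypothetical.

```python
import numpy as np


def _normalize(pts):
    """Hartley normalization: zero centroid, mean distance sqrt(2)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - centroid, axis=1))
    T = np.array([[scale, 0.0, -scale * centroid[0]],
                  [0.0, scale, -scale * centroid[1]],
                  [0.0, 0.0, 1.0]])
    return (pts - centroid) * scale, T


def estimate_homography_dlt(src, dst):
    """Estimate the 3x3 H with dst ~ H @ src from at least 4 point pairs."""
    src_n, T_src = _normalize(np.asarray(src, dtype=float))
    dst_n, T_dst = _normalize(np.asarray(dst, dtype=float))
    rows = []
    for (x, y), (u, v) in zip(src_n, dst_n):
        # Two linear constraints on the 9 entries of H per correspondence.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H_n = vt[-1].reshape(3, 3)
    # Undo the normalization and fix the scale so that H[2, 2] == 1.
    H = np.linalg.inv(T_dst) @ H_n @ T_src
    return H / H[2, 2]


def apply_homography(H, pts):
    """Map Nx2 points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]


def homographies_per_plane(src, dst, plane_ids):
    """Estimate one homography per plane label (labels assumed given)."""
    src, dst = np.asarray(src, dtype=float), np.asarray(dst, dtype=float)
    plane_ids = np.asarray(plane_ids)
    result = {}
    for pid in np.unique(plane_ids):
        sel = plane_ids == pid
        if sel.sum() >= 4:  # a homography needs at least 4 correspondences
            result[int(pid)] = estimate_homography_dlt(src[sel], dst[sel])
    return result
```

Calling `homographies_per_plane(src, dst, plane_ids)` returns a dictionary that maps each plane label to its own homography, so matches from planes at different depths are never forced into a single model. A plane with fewer than four matches is skipped, which is the situation that motivates the learned homography estimation in part (2).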