| With advances in technology, unmanned systems rely on an increasing variety of sensors, and fusing information from multiple sensors improves perception of the surrounding environment. This dissertation studies effective LiDAR-Camera Alignment (LCA) techniques using point cloud data captured by LiDAR and color image data captured by cameras. LiDAR-camera alignment aims to find correspondences between point clouds and color images in order to estimate the transformation between the LiDAR and camera coordinate systems, and thereby align the LiDAR with the camera. However, modality gaps, data noise, holes, low-texture structures, and other interference between point clouds and color images make high-precision LiDAR-camera alignment difficult. To address these problems, the research in this dissertation covers the following four areas. There are large modality gaps between point clouds and color images. To make feature extraction networks more robust to such gaps, this dissertation proposes a feature learning method based on modality adaptation. Point clouds and color images are transformed into the same depth image domain, where unified representation learning is performed. Meanwhile, to better capture the global information of the scene, this dissertation introduces ViT (Vision Transformer) into the LCA task and proposes the concept of an LCA token, which aggregates feature patterns in local space to form a global spatial feature representation. Thanks to the strong representation power of the LCA token, this method achieves state-of-the-art alignment performance and robustness on the KITTI dataset. Point cloud density has a large impact on the performance of existing learning-based LCA networks. To improve the robustness of our methods to point cloud density variation, this
dissertation converts the input point clouds into regular depth images, thus reducing the impact of point cloud density variations. Meanwhile, this dissertation proposes a novel loss function that enables the network to be trained without ground truth. The experimental results show that using the proposed regular depth images significantly improves the robustness of existing LCA methods to point cloud density variations. Current LCA networks use separate network structures for feature extraction and iterative optimization, which greatly increases the total number of parameters. To reduce this redundancy, this dissertation fuses the point clouds with the color images and uses a unified convolutional neural network to extract features, sharing that network across all iterations. In this way, the complementary information between the point clouds and the color images is fully exploited during feature extraction, and the number of parameters required by the whole algorithm is reduced. In addition, a new loss function is designed to fully exploit the correlation between the translational and rotational terms. The experimental results show that the network is significantly smaller than those of existing methods, while its alignment performance is on par with current advanced methods. Most learning-based LCA methods directly regress the transformation parameters between sensors, which turns the network into a black box and makes it difficult for researchers to understand the basis of its predictions. To enhance the interpretability of learning-based methods, this dissertation incorporates geometric constraints between point clouds and color images into the network structure, characterizes their geometric correspondence with a matching flow, and solves for the transformation parameters between the LiDAR and the camera from that matching flow. The method achieves state-of-the-art performance and clearly shows the basis of its predictions, giving the learning-based LCA method strong interpretability. |
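
The abstract describes converting point clouds into the depth image domain. As a rough illustration only, the sketch below projects LiDAR points into a sparse depth image with a standard pinhole camera model. The function name `project_to_depth_image`, the intrinsic matrix `K`, and the LiDAR-to-camera transform `T` are assumptions made for this example; the dissertation's own "regular depth image" construction may differ, for instance in how it fills or resamples pixels.

```python
import numpy as np


def project_to_depth_image(points, K, T, height, width):
    """Project LiDAR points (N, 3) into a sparse depth image (height, width).

    K is an assumed 3x3 pinhole intrinsic matrix and T an assumed 4x4
    LiDAR-to-camera rigid transform. Pixels with no point are left at 0.
    """
    # Move the points from the LiDAR frame into the camera frame.
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])
    cam = (T @ pts_h.T).T[:, :3]

    # Discard points behind the image plane.
    cam = cam[cam[:, 2] > 0]

    # Pinhole projection to integer pixel coordinates.
    uvw = (K @ cam.T).T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    z = cam[:, 2]

    # Keep only projections that land inside the image.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], z[inside]

    # When several points hit one pixel, keep the nearest one.
    depth = np.full((height, width), np.inf)
    np.minimum.at(depth, (v, u), z)
    depth[np.isinf(depth)] = 0.0
    return depth


if __name__ == "__main__":
    K = np.array([[100.0, 0.0, 50.0],
                  [0.0, 100.0, 50.0],
                  [0.0, 0.0, 1.0]])
    T = np.eye(4)  # LiDAR and camera frames coincide in this toy case
    points = np.array([[0.0, 0.0, 10.0],   # lands on the image center
                       [0.0, 0.0, 5.0],    # same pixel, but closer
                       [0.0, 0.0, -1.0],   # behind the camera
                       [100.0, 0.0, 1.0]]) # projects outside the image
    depth = project_to_depth_image(points, K, T, 100, 100)
    print(depth[50, 50], np.count_nonzero(depth))
```

Rendering both modalities on the same pixel grid is what lets one image network process them. A miscalibrated `T` shifts the projected depths relative to the color image, and that shift is the misalignment an LCA method has to estimate.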