| A homography is the mapping between two images of a planar scene taken from different viewpoints in 3D space. Homography estimation is an important step in many computer vision applications and has been studied extensively over the past few decades. In recent years, with the development of deep learning, homography estimation methods based on deep learning have been proposed. However, most current methods treat homography estimation as a supervised learning task, so training the network requires ground truth labels, while unsupervised methods suffer from low estimation accuracy. In addition, current methods extract image features with convolutional neural networks, which are suited to images with small displacements between corresponding points; when the displacement increases, the homography estimation error grows greatly. To address these problems, the main research contents of this paper are as follows. First, because most current deep-learning-based homography estimation methods are supervised and depend heavily on ground truth labels, while unsupervised methods have low accuracy, an unsupervised homography estimation method based on a cascaded CNN is proposed. The first stage of the network estimates part of the overall homography, and each subsequent stage is trained on the output of the previous stage to produce a smaller homography residual. Each stage in turn narrows the error bound, so the homography is estimated from coarse to fine. In addition, the difference between pixel values is used as the loss function, which does not require ground truth labels. Tests on the MS-COCO dataset show the effectiveness of the proposed method. Second, we propose an unsupervised multi-scale and multi-stage content-aware homography estimation method for
image registration under large disparity, which addresses the significant increase in homography estimation error caused by the narrow receptive field of convolution. First, a self-attention augmented ConvNet is used for feature extraction; the self-attention mechanism lets the extracted features take both local and global information into account. Second, a feature matching module is added to the homography estimation network to strengthen the matching relationship between features. In addition, images of different resolutions are used as input to the multi-stage network: low-resolution images are used to estimate the large-scale, global homography transformation, and high-resolution images are used to estimate the small-scale, local homography transformation. The homography between the images is thus refined gradually from coarse to fine. The experimental results show that the proposed method achieves the best performance among the compared methods, with accuracy superior to other methods when the displacement between corresponding image points is large. |
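
The coarse-to-fine idea described above can be illustrated with a minimal NumPy sketch. It is not the network proposed in this work: the stage homographies are given by hand rather than predicted, and the function names (`compose_stages`, `rescale_homography`, `warp_points`, `photometric_l1`) are hypothetical. The use of the mean absolute difference as the pixel-value loss is also an assumption, since the abstract only states that a difference between pixel values is used.

```python
import numpy as np


def compose_stages(stage_homographies):
    """Compose per-stage 3x3 homographies, applied in list order, into one matrix."""
    total = np.eye(3)
    for h in stage_homographies:
        # Each later stage acts on the output of the stages before it.
        total = h @ total
    return total


def rescale_homography(h_low, scale):
    """Map a homography estimated at low resolution to an image `scale` times larger."""
    s = np.diag([scale, scale, 1.0])
    return s @ h_low @ np.linalg.inv(s)


def warp_points(h, pts):
    """Apply a 3x3 homography to an (N, 2) array of pixel coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coordinates
    out = pts_h @ h.T
    return out[:, :2] / out[:, 2:3]  # back to Cartesian coordinates


def photometric_l1(img_a, img_b):
    """Label-free loss (assumed form): mean absolute difference between pixel values."""
    a = img_a.astype(np.float64)  # cast first so uint8 subtraction cannot wrap around
    b = img_b.astype(np.float64)
    return float(np.mean(np.abs(a - b)))


def translation(tx, ty):
    """Helper: a pure-translation homography, used only for the illustration."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])


# A coarse stage followed by a smaller residual stage (hand-picked values).
coarse = translation(4.0, 2.0)
residual = translation(0.5, -0.25)
h_low = compose_stages([coarse, residual])

# Transfer the low-resolution estimate to an image twice as large.
h_high = rescale_homography(h_low, 2.0)
```

In this sketch the residual stage only has to correct what the coarse stage left over, and the rescaling step shows how an estimate made on a low-resolution image could initialise the next, higher-resolution stage.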