
Research On Representing Local Image Features Based On Data-Driven Methods

Posted on: 2019-04-03 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: D L Zhang | Full Text: PDF
GTID: 1368330572496515 | Subject: Computer Science and Technology
Abstract/Summary:
As digital cameras, smartphones and the Internet become ubiquitous in daily life, digital images play an increasingly important role in the exchange and storage of information. When people exchange information through digital images, they expect machines to help extract and consolidate the information those images contain, for example, stitching a series of images with overlapping regions into a complete large-view image, recognizing a particular object in an image given reference information, or reconstructing 3D information from images of the same scene taken from different positions. A basic way to solve these problems is to establish correspondences among images, which is usually decided by local feature descriptors of image patches. Thus, discriminatively mapping raw image patches to feature descriptors while remaining robust to changes in patch appearance is a fundamental topic in this field. In general, recent approaches to representing local image patches fall into two categories: handcrafted methods and learning-based methods. Handcrafted methods encode patches into feature descriptors by manually designing an invariant transform that is robust to changes in patch appearance, while learning-based methods learn the invariant transform directly from data. Owing to these works, the field has seen significant improvements across various computer vision applications. However, these works also have drawbacks: encoding image patches into local feature descriptors is time-consuming, especially for learning-based methods, and the constraint guiding the mapping function is not sufficient, so the resulting descriptors may not reflect the real relationship between different image patches when their correspondence is judged. In this thesis, we address these problems; our work is as follows:

1. The triplet ranking model is a popular way to learn local feature descriptors. However, it only ensures the correct order within a single triplet of training samples and does not exploit relations between different triplets, which may lead to misjudgment when correspondences between image patches are decided from their feature descriptors. To solve this problem, we first propose a quadruplet ranking loss, which not only ensures the correct order of training samples but also makes full use of arbitrary combinations of positive and negative pairs to optimize the model. We then build a novel convolutional neural network based on the Siamese network and the residual network. Moreover, we design an online sampling algorithm to enlarge the training dataset, which helps train the model. Experiments show that this model outperforms SIFT and several recent works, and that it generalizes well to other computer vision tasks.
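The abstract does not give the exact form of the quadruplet ranking loss, so the following is only a minimal sketch of how a quadruplet-style margin loss over descriptor distances could look; the distance function, margin values, and the way the two negative pairs are formed are assumptions made purely for illustration and are not the thesis formulation.

```python
import torch.nn.functional as F

def quadruplet_ranking_loss(desc_a, desc_p, desc_n1, desc_n2,
                            margin1=1.0, margin2=0.5):
    """Illustrative quadruplet-style ranking loss (assumed form).

    desc_a, desc_p   -- descriptors of two corresponding (matching) patches
    desc_n1, desc_n2 -- descriptors of two non-corresponding patches
    The first term enforces the usual triplet ordering (matching pair closer
    than a negative pair sharing the anchor); the second term also ranks the
    matching pair against a negative pair that does not contain the anchor,
    so relations across different pairs are constrained as well.
    """
    d_pos = F.pairwise_distance(desc_a, desc_p)
    d_neg_anchor = F.pairwise_distance(desc_a, desc_n1)
    d_neg_cross = F.pairwise_distance(desc_n1, desc_n2)

    loss = (F.relu(d_pos - d_neg_anchor + margin1) +
            F.relu(d_pos - d_neg_cross + margin2))
    return loss.mean()
```

In this toy form, optimizing the second term couples pairs that share no common patch, which is one way a quadruplet constraint can use more pair combinations than a single triplet does.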
2. Recent learning-based models often contain a large number of parameters, which leads to high memory cost and long running time, yet directly reducing the number of parameters may decrease accuracy. To solve this problem, we design a simplified quadruplet model. First, we analyze several typical models to find the most time-consuming parts and build a shallow neural network based on this analysis. Then, we design a new ranking loss with a stable separating margin, which mitigates the margin-variation problem of ranking losses and makes the constraint guiding the training process stricter. As a result, the model needs only a few milliseconds to represent a single image patch while maintaining relatively high performance, which makes it suitable for real-time tasks.

3. Recent methods such as ranking models usually build constraints on the properties of individual image patches but overlook the statistical properties of the whole training set, which may lead to sub-optimal solutions. To solve this problem, we design a variance shrinkage model based on the distribution of the whole training set. First, we use a mean-value constraint that pushes the distance distribution of corresponding patches and that of non-corresponding patches apart. Then, we design a variance shrinkage constraint that shrinks each distribution towards its peak. The combination of these two constraints reduces the overlap between the two distributions, so corresponding and non-corresponding patches are discriminated in a statistical way. Compared with ranking models and other recent works, this model achieves a significant improvement in performance.
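The variance shrinkage model is likewise only described qualitatively in the abstract. The sketch below shows one plausible way a mean-separation term and a variance-shrinkage term over descriptor distances could be combined; the margin, distance measure, and weighting are assumptions for illustration, not the thesis formulation.

```python
import torch.nn.functional as F

def variance_shrinkage_loss(desc1, desc2, labels, margin=1.0, weight=1.0):
    """Illustrative distribution-based loss (assumed form).

    desc1, desc2 -- descriptors of the two patches in each pair of a batch
    labels       -- 1 for corresponding pairs, 0 for non-corresponding pairs
    """
    d = F.pairwise_distance(desc1, desc2)
    d_pos = d[labels == 1]   # distances of corresponding pairs
    d_neg = d[labels == 0]   # distances of non-corresponding pairs

    # Mean-value constraint: push the mean distance of non-corresponding
    # pairs above that of corresponding pairs by at least a margin.
    mean_term = F.relu(d_pos.mean() - d_neg.mean() + margin)

    # Variance-shrinkage constraint: concentrate each distance distribution
    # around its own mean (its peak), reducing the overlap between the two.
    var_term = d_pos.var() + d_neg.var()

    return mean_term + weight * var_term
```

Minimizing such a loss both separates the two distance distributions and sharpens them, which matches the stated goal of reducing their overlap in a statistical way.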
Keywords/Search Tags:Local feature descriptor, Feature extraction, Convolutional neural network, Patch matching, Digital image processing