| With the spread and development of computer technology, people are no longer satisfied with receiving information passively, and human-computer interaction has become an important issue in the optimization and use of computers. Augmented Reality (AR), as a new mode of human-computer interaction, is attracting increasing attention. It has been applied in fields such as military combat exercises, simulated surgery in medicine, car maintenance assistance, film production, interactive games, travel guides, and everyday-life assistants. In these applications, virtual objects or information must be superimposed on real objects or scenes, which requires the geometric pose, including translation, scaling, and three-dimensional rotation, to be estimated and tracked quickly and accurately. This is a key step in combining the virtual and the real seamlessly, and it affects the user's experience of human-computer interaction. However, improving the performance of such systems poses many challenges. On the one hand, object detection is time-consuming and limits the real-time performance of the system. On the other hand, the accuracy of detecting and tracking a moving object is seriously affected by complex conditions in real scenes, such as blurred images with low resolution or low illumination, jitter of the moving object, large-area occlusion, noise, sharp illumination changes, and flipping through large angles. Accordingly, this paper studies three issues: how to speed up object feature detection and matching, how to improve the efficiency of frame-by-frame object pose tracking, and how to handle object pose estimation in complex environments. The main work is as follows: (1) A double circle structure descriptor (DCSD) is presented and applied to speed up feature extraction and matching in feature detection. It relies on a 
double circle structure of overlapping regions around the feature and is 40 bits long, so it can be matched quickly using the Hamming distance. The descriptor is rotation invariant and robust against blur, illumination changes, JPEG (Joint Photographic Experts Group) compression, and orientation changes. In addition, since binary descriptors are generally less descriptive than floating-point descriptors, a new matching measure named Hough Voting Matching (HVM), based on clustering and Hough voting schemes, is proposed. It casts feature matching as a density estimation problem in the Hough space: all cluster correspondences are projected into the Hough space and then voted on according to the density of feature correspondences. HVM efficiently removes incorrect matches while keeping correct ones, which lowers the outlier ratio, and it can be combined with other descriptors as an independent module to improve matching accuracy. Comparative experiments show that the DCSD-HVM algorithm can compete with state-of-the-art floating-point and binary descriptors in accuracy without losing its speed advantage. (2) An iterative optimization tracking (IOT) model based on inter-frame matching is proposed for object pose estimation. It addresses the problem that pose accuracy degrades because of the matching strategy or error accumulation during continuous tracking. First, an iterative optimization method is presented that reduces the deviations of the tracked features by iteratively adjusting their locations based on neighborhood information, increasing the proportion of valid features as much as possible. Then, a probabilistic voting method based on a Bayesian conditional joint distribution function is offered for feature classification, so that invalid features can be discarded. Finally, a histogram-based score model is provided for pose evaluation to achieve an adaptive detection 
interval. The developed algorithms can be combined with inter-frame tracking schemes as an independent optimization module. The experimental results show that the IOT model improves traditional pose tracking methods based on inter-frame matching and achieves a trade-off between the speed and accuracy of the system. (3) A part-based tracking method for object pose estimation (PTPE) is proposed for cases in which neither detection nor tracking based on feature matching performs well, because too few features are matched to initiate tracking in some complex environments. PTPE is a compromise between global template matching and local feature matching. First, a reverse-projection corner voting method and a density clustering strategy are provided to extract several parts of the whole object and to ensure that the parts reflect both the salient local features and the global structure of the object. Second, a multi-strategy fusion method is offered to evaluate the inter-frame tracking result: depending on the tracking result, the parts are processed concurrently by multiple strategies such as tracking, learning, and detection. Third, an optimization method combining part constraints and online learning is proposed to improve the efficiency of part detection. The experimental results show that PTPE can estimate the object pose even when it is difficult to obtain enough features in a frame under complex real-world conditions. Finally, this paper compares the three methods experimentally, analyzes their characteristics and applications, and discusses their advantages, disadvantages, and possible directions for improvement. |
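
To illustrate why a 40-bit binary descriptor can be matched quickly, the following minimal sketch shows brute-force matching by Hamming distance. It is not the DCSD or HVM implementation: the function names `hamming` and `match`, the integer encoding of descriptors, and the threshold `max_dist` are assumptions made only for this example.

```python
from typing import List, Tuple

DESCRIPTOR_BITS = 40                      # descriptor length stated in the abstract
MASK = (1 << DESCRIPTOR_BITS) - 1         # keeps only the low 40 bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two descriptors stored as integers."""
    return bin((a ^ b) & MASK).count("1")


def match(query: List[int], train: List[int],
          max_dist: int = 10) -> List[Tuple[int, int, int]]:
    """Brute-force nearest-neighbour matching (hypothetical helper).

    Returns (query_index, train_index, distance) for every query descriptor
    whose closest train descriptor lies within max_dist bits.
    """
    if not train:
        return []
    matches = []
    for qi, q in enumerate(query):
        # Closest train descriptor under the Hamming distance.
        ti, dist = min(((i, hamming(q, t)) for i, t in enumerate(train)),
                       key=lambda pair: pair[1])
        if dist <= max_dist:              # max_dist is an assumed threshold
            matches.append((qi, ti, dist))
    return matches
```

Each comparison is one XOR and one bit count, which is why binary descriptors are cheaper to match than floating-point descriptors compared by Euclidean distance. A filtering stage such as HVM would then operate on the candidate matches this step produces.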