Research On Hashing Based Methods For Large Scale Multi-label Image Search

Posted on: 2017-04-09  Degree: Master  Type: Thesis
Country: China  Candidate: S S Wang  Full Text: PDF
GTID: 2308330485982521  Subject: Computer technology
Abstract/Summary:
With the rapid development of computer and information technology, the data accumulated in all walks of life is growing explosively, and we have entered the era of big data. Big data has broad prospects in application fields such as transportation, the military, education, environmental protection, medicine, meteorology, and finance, and it is developing rapidly in scientific fields such as information science, chemistry, biology, physics, astronomy, and mathematics. Big data has become an important national strategic resource, and its analysis, storage, and management have become a focus for both practitioners and researchers. Using big data effectively requires machine learning techniques designed for it, so big data machine learning is one of the key topics of big data research.

Hash learning uses machine learning methods to project data into binary hash codes. It is also a form of dimensionality reduction: it can greatly reduce the memory overhead of the data and improve the efficiency of the learning system. Recently, hashing methods have attracted increasing attention because of their effectiveness in large-scale search over data such as images and videos. Unsupervised, supervised, and semi-supervised hashing methods have been proposed for different scenarios. In particular, when semantic information is available, supervised hashing methods generally perform better than unsupervised ones.

In many practical applications today, a data sample has more than one label. With the rapid development of multimedia and Internet technology, multi-label data is growing explosively, and multi-label learning is receiving more and more attention.
However, few supervised hashing methods consider such a multi-label scenario.

In this thesis, we propose a new supervised method for large-scale multi-label image search, called Multi-label Least-Squares Hashing (MLSH). MLSH handles multi-label data directly and learns the final hash function in several steps. First, it uses feature extraction methods (such as GIST or SIFT) to obtain the feature matrix. Second, it uses the least-squares form equivalent to canonical correlation analysis to project the original multi-label data onto a low-dimensional space. Third, in the low-dimensional space, it learns a PCA projection matrix and an ITQ rotation matrix to obtain the final binary hash codes. The rotation step adds little computation but greatly improves the effectiveness of the method. The final hash codes can then be used for approximate nearest neighbor (ANN) image search. We run experiments on the NUS-WIDE and CIFAR-100 data sets using MAP, precision, recall, and other evaluation measures, and compare MLSH with several state-of-the-art methods, e.g., LSH, ITQ, LS-SPH, PCAH, and CCA+ITQ. The experimental results show that MLSH outperforms these methods.
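
The learning steps described above can be illustrated with a rough NumPy sketch. It is not the thesis's implementation: the abstract gives no formulas, so the ridge-regularised least-squares step, the parameter values, and all function names (`fit_sketch`, `encode`, `hamming_rank`) are assumptions made for illustration. The feature matrix `X` stands in for precomputed GIST or SIFT descriptors, and `Y` is a 0/1 multi-label matrix.

```python
import numpy as np

def itq_rotation(V, n_iter=50, seed=0):
    """Alternating minimisation of ||B - V R||_F over B in {-1,+1} and orthogonal R."""
    rng = np.random.default_rng(seed)
    k = V.shape[1]
    # Start from a random orthogonal matrix.
    R, _ = np.linalg.qr(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        B = np.where(V @ R >= 0, 1.0, -1.0)   # fix R, update the codes
        U, _, Wt = np.linalg.svd(V.T @ B)     # fix B, solve orthogonal Procrustes
        R = U @ Wt
    return R

def fit_sketch(X, Y, n_bits, reg=1e-3):
    """Hypothetical pipeline: least-squares projection, then PCA, then ITQ.

    X: (n, d) features, Y: (n, c) 0/1 labels, n_bits <= c (assumed here).
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    Yc = Y - Y.mean(axis=0)
    d = X.shape[1]
    # Regularised least squares from features to labels (assumed form).
    W = np.linalg.solve(Xc.T @ Xc + reg * np.eye(d), Xc.T @ Yc)
    V = Xc @ W                                # (n, c) low-dimensional embedding
    # PCA: keep the n_bits directions with the largest variance.
    eigvals, eigvecs = np.linalg.eigh(V.T @ V / V.shape[0])
    P = eigvecs[:, np.argsort(eigvals)[::-1][:n_bits]]
    R = itq_rotation(V @ P)
    return {"mean": mean, "W": W, "P": P, "R": R}

def encode(X, model):
    """Map features to binary codes (0/1) with the learned projections."""
    Z = (X - model["mean"]) @ model["W"] @ model["P"] @ model["R"]
    return (Z >= 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query code."""
    dist = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dist, kind="stable")
```

In this sketch the code length is bounded by the number of labels, because the least-squares step maps features into the label space before PCA and ITQ are applied. Retrieval then reduces to ranking database codes by Hamming distance to the query code.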
Keywords/Search Tags:hashing, multi-label, supervised