Zero-shot Learning Research Based On Deep Generative Models

Posted on: 2020-03-19 | Degree: Master | Type: Thesis
Country: China | Candidate: X Zhou | Full Text: PDF
GTID: 2428330596475068 | Subject: Computer Science and Technology
Abstract/Summary:
Large-scale human-labeled image databases have propelled the development of deep learning, which achieves state-of-the-art performance in various computer vision tasks. In real-world scenarios, however, labeled instances are often unavailable for some classes, because collecting and annotating sufficient training data for an ever-growing set of classes is a heavy burden. Traditional classification methods cannot recognize instances of previously unseen classes. Zero-shot learning (ZSL) has been recognized as an effective way to tackle this problem and has attracted increasing interest in the vision community. ZSL aims to recognize objects without having seen any visual instances of them, by learning to transfer knowledge between seen and unseen classes. Attribute-based ZSL approaches, which introduce an intermediate semantic space of attributes for knowledge transfer, have shown impressive performance. In the semantic space, seen and unseen classes are represented by high-dimensional vectors.

However, previous methods have three problems. First, the semantic space used for recognition may be unreliable because of noisy class embeddings or visual bias. Second, because unseen classes are unknown in advance, we lack any knowledge about them, so providing attribute annotations for unseen classes at test time is time-consuming and labor-intensive. Third, some attributes correspond to very different visual appearances in different classes, so directly using attributes as the intermediate space for knowledge transfer inevitably leads to projection domain shift at test time. In addition, in a high-dimensional space some points become the nearest neighbors of most other points, which aggravates the hubness problem at prediction time for methods that rely on such a space.

To address the noisy embedding and visual bias problems, we propose a novel zero-shot learning method based on visual-attribute embedding, which recognizes unknown categories in an intermediate Hamming space. Specifically, our method learns two binary encoding functions that map visual features and class embeddings into the Hamming space, which relieves the visual-semantic bias problem. During training, by introducing two additional auxiliary variables into the original mapping function, we rewrite the original problem as an equivalent maximization problem that has an analytical solution. The proposed algorithm therefore trains efficiently and scales well. Extensive experiments on four benchmark datasets demonstrate the superiority and high scalability of our approach on zero-shot learning tasks.

We further propose a novel attribute-based ZSL method, built on a deep generative network, that leverages word embeddings of class names as a complement to attributes and performs zero-shot prediction with only the word embeddings of unseen class names. Specifically, in the training phase, the proposed method treats visual features, attributes, and word embeddings as three different views of a visual instance and assigns each view an auto-encoder. The method simultaneously ensures robust view-specific reconstruction and cross-view synthesis while preserving the discriminability of class labels. Without attributes for the unseen classes, the method can synthesize visual features from their word embeddings and perform zero-shot prediction in the visual feature space. The method is also flexible enough to learn from partial views, where either the attribute view or the word-embedding view is missing during training. As for the projection domain shift and hubness problems, which are caused by directly using attributes as the intermediate semantic space, our method generates visual features in the visual space and thereby avoids both. Extensive experiments on four benchmark datasets show that the proposed method effectively overcomes the aforementioned drawbacks and achieves superior performance compared with state-of-the-art ZSL methods.
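
To make the Hamming-space prediction step described in the abstract more concrete, the following is a minimal sketch in plain Python. It is not the thesis's algorithm: the two projection matrices are hand-picked toy values standing in for the learned binary encoding functions, sign thresholding is an assumed binarization rule, and the class names, attribute vectors, and function names (`binarize`, `hamming`, `predict`) are all hypothetical.

```python
from typing import Dict, List, Sequence


def binarize(vector: Sequence[float], projection: Sequence[Sequence[float]]) -> List[int]:
    """Map a real-valued vector to a binary code by sign-thresholding a linear projection.

    ASSUMPTION: a linear projection plus sign threshold stands in for the
    learned binary encoding functions mentioned in the abstract.
    """
    code = []
    for row in projection:
        activation = sum(w * x for w, x in zip(row, vector))
        code.append(1 if activation >= 0 else 0)
    return code


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(visual_code: Sequence[int], class_codes: Dict[str, List[int]]) -> str:
    """Return the class whose binary code is nearest in Hamming distance."""
    return min(class_codes, key=lambda name: hamming(visual_code, class_codes[name]))


# Toy, hand-picked projections (hypothetical; a real system would learn these).
visual_projection = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, -1.0, 0.0],
]
attribute_projection = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [1.0, -1.0],
]

# Hypothetical attribute vectors for two unseen classes.
unseen_attributes = {"zebra": [1.0, -1.0], "whale": [-1.0, 1.0]}
class_codes = {
    name: binarize(attrs, attribute_projection)
    for name, attrs in unseen_attributes.items()
}

# A hypothetical visual feature for one test image.
image_feature = [0.9, -0.4, 0.2]
image_code = binarize(image_feature, visual_projection)
predicted = predict(image_code, class_codes)
print(predicted)
```

Because both views end up as short binary codes, comparing a test image against every unseen class reduces to counting differing bits, which is one plausible reason a Hamming-space formulation would scale well.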
Keywords/Search Tags: Zero-shot Learning, Deep Generative Model, Multi-view Learning, Embedding Space, Word Embedding