
Zero-shot Image Classification Based On Autoencoder

Posted on: 2023-02-03
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhang
Full Text: PDF
GTID: 2568306788466444
Subject: Information and Communication Engineering
Abstract/Summary:
Zero-shot image classification trains on data from seen classes and transfers that knowledge to unseen classes with the aid of auxiliary information, thereby recognizing the unseen classes. It addresses the shortage of labeled data for new species in practice and promotes the development of image classification, which gives it considerable research and application value. To address the problems of domain shift, insufficient attribute description, and hubness in zero-shot image classification, this thesis proposes two autoencoder-based zero-shot image classification models that build on the semantic autoencoder model and combine it with contrastive learning, a semantic constraint, and a visual constraint. The main work is as follows:

1. To address the domain shift problem in zero-shot image classification, a zero-shot image classification model based on an autoencoder with an unseen-class semantic constraint is proposed. First, a pre-trained ResNet101 network is used for feature extraction; the visual center of each seen class is computed, and the visual centers of the unseen classes are obtained with an unsupervised clustering algorithm. Second, an encoder maps the visual centers of the seen classes to the semantic space and aligns them with the semantic class prototypes. Then, a decoder reconstructs the mapped semantic vectors of the seen classes into visual features, which are aligned with the visual centers. Meanwhile, the unseen-class semantic constraint is used to constrain the training of the autoencoder. Finally, the similarity between the semantic vector that the encoder predicts for a test image and the semantic prototype of each test class is computed in the semantic space, and the nearest neighbor algorithm is used to achieve zero-shot image classification.

2. To address the insufficient attribute description problem and the hubness problem in zero-shot image classification, a zero-shot image classification model based on hidden attribute extension and an autoencoder with an unseen-class visual constraint is proposed. First, a pre-trained contrastive learning model is used to extract hidden attributes of all object classes, and a fully connected network combines the hidden attributes with the existing semantic attributes to form mixed attributes. Second, the visual space is used as the embedding space, and an autoencoder learns the mapping from the semantic space to the visual space. Meanwhile, the unseen-class visual constraint is used to constrain the training of the autoencoder. Then, the mixed attributes of the unseen classes are fed into the trained encoder to obtain the predicted visual centers of the unseen classes, which, together with the real visual centers of the seen classes, form the visual space. Finally, the similarity between the visual feature vector of a test sample and the prototype of each test class is computed in the visual space, and the nearest neighbor algorithm is used to achieve zero-shot image classification.

Experiments are carried out on the AwA2 dataset and the CUB dataset, and the results are compared and analyzed. They verify that the two models proposed in this thesis effectively improve the performance of zero-shot image classification. This dissertation contains 30 figures, 8 tables, and 106 references.
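Both models end with the same step: compare a test vector with one prototype per class and pick the nearest. The following is a minimal sketch of that step only, not the thesis implementation. It assumes cosine similarity (the abstract says only "similarity"), assumes the test vector already lies in the same space as the prototypes (for example, after the encoder has been applied), and uses hypothetical helper names (`class_centers`, `cosine`, `nearest_prototype`) and toy two-dimensional data.

```python
import math


def class_centers(features, labels):
    """Average the feature vectors of each class to get one center per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_prototype(vector, prototypes):
    """Return the label whose prototype is most similar to the vector."""
    return max(prototypes, key=lambda label: cosine(vector, prototypes[label]))


# Toy data (assumed, for illustration only): two classes in a 2-D space.
features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
labels = ["horse", "horse", "bird", "bird"]
centers = class_centers(features, labels)

predicted = nearest_prototype([0.7, 0.3], centers)
print(predicted)  # horse
```

In the first model the prototypes would be the semantic class prototypes and the test vector the encoder output; in the second, the prototypes would be the visual centers (predicted for unseen classes) and the test vector the extracted visual feature.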
Keywords/Search Tags: zero-shot image classification, autoencoder, domain shift, contrastive learning, semantic constraint, visual constraint