| The performance of existing deep learning models depends largely on the amount of labeled image data, yet large labeled datasets are difficult to obtain when the task is to identify rare species or new species that emerge as society develops. Zero-shot image recognition emerged in response, focusing on how to maintain high recognition accuracy when data are limited. The starting point of zero-shot learning is to imitate the human learning process: humans can directly derive the concept of an object they have never seen from concepts they have already mastered. Zero-shot learning aims to integrate auxiliary information into existing image recognition datasets to help deep learning models achieve concept transfer. The auxiliary information is usually semantic information from which species-specific concepts are extracted. Early zero-shot learning algorithms usually rely on mapping-based methods that map image features into a semantic space. The model is trained so that each mapped image feature lies close to its corresponding semantic vector, and a nearest-neighbor search is performed in the semantic space at test time. Although mapping-based methods perform well in traditional zero-shot learning, they perform very poorly in more realistic settings such as generalized zero-shot learning. With the rapid development of generative adversarial networks (GANs), it has become possible to generate object-specific image data from semantic information. Generative zero-shot learning methods based on GANs can synthesize large amounts of image data for unseen classes, which can then be used to train deep learning networks, significantly improving model performance in generalized zero-shot learning scenarios. The main work of this paper is as follows. (1) Most zero-shot learning methods based on generative models use generative adversarial networks, which suffer from problems such as mode collapse and training difficulty. This paper introduces a new
generator, the diffusion model. The model gradually adds Gaussian noise to image features and then denoises them, using a U-Net to predict the noise added to the original image features; it is easy to train and converges readily. Experimental results on the CUB, AWA2, SUN, and FLO datasets show that the diffusion model improves the quality of generated image features while keeping training efficient. (2) Image feature confusion has long limited the classification accuracy of image feature classifiers. Contrastive learning is an unsupervised learning method that can address feature confusion. In this paper, contrastive learning is used to embed the generated image features into an intermediate space, where classification is performed by comparing the similarity of image feature samples within the same category and across different categories. The experimental results show that the contrastive learning module effectively alleviates image feature confusion and achieves good classification performance on the CUB, AWA2, SUN, and FLO datasets. (3) This paper trains a zero-shot learning model based on the diffusion model and contrastive learning, and applies it to a rare flower recognition system. The system adopts a browser/server architecture. The front end is built with the Vue framework and provides user registration and login, image upload, display of image recognition results, and user feedback. The back end uses the NVIDIA Triton framework to handle image recognition computation, system log collection, and data storage. |
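
To make the noising step described in contribution (1) more concrete, the following is a minimal sketch, not the paper's implementation, of a standard DDPM-style forward process applied to a feature vector. The linear noise schedule, the number of steps, the example feature values, and all function names are assumptions chosen for illustration; the abstract does not specify them. In such a setup, a noise-prediction network (a U-Net in the paper) would be trained to recover `eps` from the noised feature `x_t` and the step index `t`.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule; the values are illustrative assumptions."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form and return (x_t, eps)."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return x_t, eps


def noise_prediction_loss(eps_pred, eps_true):
    """Mean squared error between predicted and true noise."""
    return sum((p - q) ** 2 for p, q in zip(eps_pred, eps_true)) / len(eps_true)


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    feature = [0.5, -1.2, 0.3, 2.0]  # stand-in for an image feature vector
    x_t, eps = add_noise(feature, t=500, alpha_bars=alpha_bars, rng=rng)
    print("noised feature:", x_t)
    # A perfect noise predictor would reach zero loss on this sample.
    print("loss with perfect prediction:", noise_prediction_loss(eps, eps))
```

Because `alpha_bars` decreases with `t`, later steps retain less of the original feature and more noise; generation would run the learned denoiser in the reverse direction, conditioned on the class semantics.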