| Imaging equipment such as digital cameras and surveillance cameras is becoming increasingly widespread, producing large amounts of visual data and giving rise to a variety of visual tasks, such as image classification, image retrieval, image segmentation, object recognition, and object detection. Following the great success of deep learning in natural language processing, many researchers have introduced deep learning techniques into visual tasks, designing various visual neural networks that effectively improve performance. Image retrieval is an important visual task that aims to retrieve images whose content is similar to that of a given query image. In the era of big data, quickly and accurately retrieving images similar to a query from a massive image collection has become an important problem in image processing research. Image hashing is an efficient technique for image retrieval: it computes a binary feature code from the visual content of an image. Designing high-performance image hashing algorithms for retrieval with visual neural networks is currently an active research topic. This thesis summarizes the main problems and solutions in visual neural network technology by studying the mainstream techniques. On this basis, two new image hash retrieval algorithms are designed using visual neural network technologies such as knowledge distillation, generative adversarial networks, and contrastive learning. The two algorithms are an image hash retrieval algorithm based on contrastive learning and generative adversarial networks, and a face hash retrieval algorithm based on knowledge distillation and optimal transport. The main research contents of this thesis are as follows. (1) A survey of visual neural network technology. Visual neural networks (VNNs) are one of the most important topics in deep learning, with applications covering image classification, image segmentation, object detection, and object recognition. Generally
speaking, VNNs can be divided into two types: convolutional neural networks (CNNs) and Transformer networks. Over the past decade, CNNs have dominated research on visual tasks. In recent years, Transformer networks have been successfully applied in natural language processing and computer vision, and have achieved significant performance improvements in many visual tasks. This thesis first introduces the basic structures and technical points of these two types of VNN, then summarizes the three main challenges that VNNs face: scalability, robustness, and interpretability. It then outlines lightweight, robust, and interpretable solutions, and finally summarizes future research opportunities for VNNs. (2) An image hash retrieval algorithm based on contrastive learning and generative adversarial networks is proposed. To address the poor robustness and excessive parameter counts of traditional visual neural networks, this thesis proposes a lightweight and robust image hash retrieval algorithm that uses contrastive learning and generative adversarial networks. Through self-supervised adversarial training, the algorithm first obtains a robust teacher network, and then trains the student network with generative adversarial networks to enhance the model’s robustness. Next, the algorithm imitates immune injection to distill knowledge, effectively compressing the network while preserving model performance. Finally, an attention mechanism based on convolution modules is used to extract the image hash sequence. Experimental results indicate that the proposed image hashing algorithm outperforms various benchmark hash retrieval algorithms, with better robustness and fewer model parameters. (3) A face hash retrieval algorithm based on knowledge distillation and optimal transport is proposed. To address the excessive model parameters, high computational complexity, and inaccurate fine-grained retrieval of visual neural networks, this thesis proposes a lightweight face hash retrieval algorithm based
on knowledge distillation and optimal transport. One important contribution of this algorithm is an attention-based triplet knowledge distillation scheme, whose loss function consists of an attention loss, a Kullback-Leibler loss, and an identity loss. This distillation scheme focuses on the salient regions of the face while reducing network parameters. Another contribution is a hash quantization scheme based on optimal transport. This scheme partitions the facial feature space by calculating class centers and uses optimal transport to achieve binary quantization, effectively improving hash retrieval performance. In addition, an alternating training strategy is designed to fine-tune the parameters of the lightweight hash network. Experimental results show that the lightweight hash retrieval algorithm outperforms several well-known hash retrieval algorithms on two benchmark face datasets. |
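
To make the idea of hash-based retrieval concrete, the following is a minimal sketch and not the algorithms proposed in this thesis. It assumes simple sign-based binarization, swapped in here as a stand-in for the optimal-transport quantization described above, and ranks database items by Hamming distance to the query code. The function names `binarize`, `hamming`, and `retrieve`, as well as the feature values, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack the sign pattern of a real-valued feature vector into an integer hash code.

    Assumption: sign thresholding at zero, a common simplification.
    """
    code = 0
    for value in features:
        # Each feature contributes one bit: 1 if non-negative, 0 otherwise.
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Count the bits in which two hash codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query: int, database: Sequence[int], top_k: int) -> List[int]:
    """Return the indices of the top_k database codes nearest to the query."""
    order = sorted(range(len(database)), key=lambda i: hamming(query, database[i]))
    return order[:top_k]


# Hypothetical 4-dimensional features standing in for network outputs.
database_features = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.7, -0.6, 0.2, 0.1],
]
database_codes = [binarize(f) for f in database_features]
query_code = binarize([0.8, -0.1, 0.5, -0.3])
nearest = retrieve(query_code, database_codes, top_k=2)
```

Because the codes are compact bit strings and the comparison is an XOR followed by a bit count, ranking stays cheap even for large collections, which is the efficiency argument usually made for hashing in retrieval.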