Alternative Image Generation Method For Privacy-Preservation

Posted on: 2024-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: W Y Li
Full Text: PDF
GTID: 2568307064985179
Subject: Computer Science and Technology
Abstract/Summary:
Image datasets have played a fundamental role in the development of machine learning research. As datasets continue to grow in scale, more and more real-world datasets are used in computer vision research, and incidents of data privacy leakage occur constantly. The leakage of private information not only threatens users' personal and property safety, but can even endanger corporate and national public security. It is therefore of great practical significance to study methods that protect private information in data.

Depending on whether a dataset is released, image datasets can be divided into public and private datasets; depending on the sensitivity of the information they contain, this research further divides them into partially private and fully private datasets. Mainstream privacy-preserving machine learning techniques, such as homomorphic encryption, differential privacy, and secure multi-party computation, focus on protecting private datasets and cannot be applied to public ones. For the privacy protection of public datasets, existing research usually adopts methods such as data obfuscation and information cropping; however, these methods can only handle partially private datasets and are not suitable for public, fully private datasets. To address these problems, this thesis designs a privacy-preserving scenario for image datasets based on alternative images, together with a method for generating alternative image datasets. Specifically, the main work of this thesis is as follows:

1. To meet the privacy-protection requirements of existing public, fully private image datasets, this thesis designs a privacy-preserving scenario for image datasets based on alternative images and a method for generating alternative image datasets. In this scenario, the original image dataset is replaced by an alternative dataset processed by a privacy-preserving method. Each alternative image corresponds one-to-one to an original image; humans cannot identify the category of an alternative image, yet the alternative images can still train existing deep learning image classification algorithms with good classification performance.

2. Existing dataset privacy-protection methods cannot adapt well to the requirements of this scenario. Therefore, an alternative image generation method based on adversarial attacks is proposed for privacy protection. By retaining the robust features of the original image and introducing the non-robust features of a random image, the method generates alternative images that meet the needs of the scenario. The robust features of the original image ensure that the alternative image can replace the original in training a neural network, i.e., they guarantee the substitutability of the alternative image. The non-robust features of the random image keep the alternative image visually far from the original, making it unrecognizable to humans, i.e., they guarantee the privacy of the alternative image.

3. To further meet the needs of this scenario, this thesis constructs a privacy-preserving alternative image generation model consisting of an adversarial training module, an alternative image generation module, a data augmentation module, and a testing module. The adversarial training module uses adversarial training to produce a robust model; the alternative image generation module uses the proposed method to generate an alternative image for each image in the original dataset, forming an alternative version of the dataset; the data augmentation module introduces a data augmentation algorithm and trains the classification model on the alternative dataset; and the testing module evaluates the classification model on the original test set. This model makes it possible, in scenarios where the privacy requirements of the original dataset are high, to publish an alternative version of the dataset instead of the original, without degrading the classification performance of deep learning image classification models on the standard test set.

4. This thesis uses the CIFAR and CINIC datasets to validate the proposed dataset privacy-preserving scenario and its alternative image generation method. First, the proposed method generates an alternative image for each image in the original training set, forming an alternative version of the training set. Then, a classic neural-network-based image classification algorithm is trained on the alternative training set. Finally, the trained classifier is evaluated on the standard test set to check whether the alternative dataset can train existing methods. After training on the alternative CIFAR and CINIC datasets, the existing method achieves test accuracies of 87.15% and 77.21%, respectively, on the standard test sets. In addition, this thesis presents a visual analysis of the generated alternative images. The experimental results show that the alternative images generated by the proposed method can serve as a high-quality image classification training set, allowing existing methods to maintain good performance while better protecting the privacy of the original dataset.
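The intuition behind point 2 — matching the features the model relies on while leaving the visual content unrelated to the original — can be illustrated with a toy sketch. This is not the thesis's actual method (which optimizes against the features of a deep adversarially trained robust model); here a low-rank random linear map stands in for the feature extractor, so that feature matching constrains only a small subspace of pixel space and the random starting image's appearance survives in the null space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a frozen robust feature extractor: a random linear map.
# k << d, so matching features constrains only a small subspace of pixels.
d, k = 64, 8
W = rng.normal(size=(k, d))

def features(x):
    return W @ x

x_orig = rng.normal(size=d)   # the "private" original image, flattened
x_alt = rng.normal(size=d)    # start from an unrelated random image

# Gradient descent on ||features(x_alt) - features(x_orig)||^2.
# Step size 1/L, with L the gradient's Lipschitz constant, ensures convergence.
L = 2.0 * np.linalg.norm(W, 2) ** 2
for _ in range(2000):
    grad = 2.0 * W.T @ (features(x_alt) - features(x_orig))
    x_alt -= grad / L

feat_gap = np.linalg.norm(features(x_alt) - features(x_orig))
pix_gap = np.linalg.norm(x_alt - x_orig)
# Features now match (substitutability), while the pixels remain far from the
# original (the random image's content survives in W's null space: privacy).
print(f"feature gap: {feat_gap:.2e}, pixel gap: {pix_gap:.2f}")
```

In the thesis's setting the "features" are those of a robust model and the optimization is an adversarial-attack-style procedure on real images, but the trade-off is the same: the alternative image agrees with the original in the subspace the classifier uses, and with the random image everywhere else.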
Keywords/Search Tags: Deep Learning, Privacy Protection, Computer Vision, Adversarial Attack, Adversarial Example