
Research On Model Extraction And Membership Inference Attacks Against Machine Learning Classification Models

Posted on: 2024-08-23    Degree: Master    Type: Thesis
Country: China    Candidate: P P Yang
GTID: 2568306932962029    Subject: Information security

Abstract/Summary:
Research on attack methods against machine learning models is of great significance for improving the security, confidentiality, and robustness of these models. By studying such attacks, vulnerabilities in the models can be identified and corresponding defenses proposed, protecting the security and stability of machine learning models, increasing people's trust in machine learning, and promoting the widespread application and development of machine learning technology. This thesis focuses on model extraction attacks and membership inference attacks against machine learning classification models. For model extraction attacks, this thesis proposes a data pre-generation framework that reduces query cost and training time by generating the training data before the substitute model is trained. For membership inference attacks, this thesis proposes a shadow-model-based method that works when the target model only returns labels. In summary, this thesis investigates model extraction and membership inference attacks on classification models. The main contributions are as follows:

· This thesis proposes a model extraction attack method based on a data pre-generation framework. Existing data-free attack methods, which assume no access to the target's training data, suffer from query redundancy and long training times. This thesis therefore synthesizes the training data in advance, before the substitute model is trained, in order to speed up model training and reduce query cost. Specifically, we first train a generative model on a public dataset to produce unlabeled samples and obtain an initial dataset by querying the target model for labels. Next, we propose category-balance and robustness filtering strategies to select samples, yielding the pre-generated training dataset. Finally, we use this dataset to train the substitute model locally (a hedged code sketch of this pipeline follows the contribution list). We evaluated the method on three widely used image classification datasets: CIFAR-10, MNIST, and Fashion-MNIST. Compared with previous methods, ours greatly reduces the query cost and training time of model extraction attacks while achieving comparable or even higher substitute model accuracy.

· This thesis proposes a membership inference attack method based on shadow models and sample optimization. To handle the scenario where the target model only returns labels, we use a shadow model, rather than the target model, to estimate the prediction probability vectors of samples. Specifically, we first train a local shadow model on generated data, construct the attack model's training data from the shadow model, and then train the membership inference attack model (a second sketch below illustrates this pipeline). In addition, to help the shadow model better learn the decision boundary of the target model, we propose a sample optimization strategy that adjusts the generated data. Compared with methods that rely on adversarial examples to handle the label-only setting, our method has an advantage in query cost because it uses shadow models. We evaluated the method on two popular image classification datasets, CIFAR-10 and MNIST, and the experimental results demonstrate the superiority of our method. Ablation experiments further demonstrate the effectiveness of the sample optimization strategy.
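The following minimal sketch illustrates the data pre-generation idea described in the first contribution. It is an assumption-laden illustration rather than the thesis implementation: the generator, the query_target callable, the per-class cap used for category balance, and the noise-perturbation robustness check are hypothetical stand-ins written in PyTorch.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


def pre_generate_dataset(generator, query_target, n_samples, n_classes,
                         per_class_cap, flip_std=0.05, latent_dim=100):
    """Build a labelled training set for the substitute model before training.

    generator     -- model pre-trained on a public dataset; maps noise -> images
    query_target  -- callable returning the target model's predicted labels
    per_class_cap -- category-balance filter: keep at most this many per class
    flip_std      -- robustness filter (one possible interpretation): drop
                     samples whose label flips under a small perturbation
    """
    kept_x, kept_y = [], []
    counts = torch.zeros(n_classes, dtype=torch.long)
    while len(kept_x) < n_samples:
        z = torch.randn(64, latent_dim)
        with torch.no_grad():
            x = generator(z)                                # unlabeled samples
        y = query_target(x)                                 # labels from target
        y_noisy = query_target(x + flip_std * torch.randn_like(x))
        stable = y == y_noisy                               # robustness filter
        for xi, yi, ok in zip(x, y, stable):
            if ok and counts[yi] < per_class_cap:           # category balance
                kept_x.append(xi)
                kept_y.append(int(yi))
                counts[yi] += 1
    return torch.stack(kept_x), torch.tensor(kept_y)


def train_substitute(substitute, dataset, epochs=10, lr=1e-3, batch_size=128):
    """Train the local substitute model on the pre-generated dataset only."""
    x, y = dataset
    loader = DataLoader(TensorDataset(x, y), batch_size=batch_size, shuffle=True)
    opt = torch.optim.Adam(substitute.parameters(), lr=lr)
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss = F.cross_entropy(substitute(xb), yb)
            loss.backward()
            opt.step()
    return substitute
```

Because the labelled dataset is fixed before substitute training begins, no further queries to the target model are issued during training, which is where the claimed savings in query cost and training time come from.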
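The second sketch outlines the shadow-model route for the label-only membership inference setting. It is likewise a hedged illustration: the sorted-probability features, the small attack network, and all function names are assumptions chosen for clarity, and the thesis's sample optimization strategy is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttackModel(nn.Module):
    """Small binary classifier: probability vector -> member / non-member."""
    def __init__(self, n_classes):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_classes, 64), nn.ReLU(),
                                 nn.Linear(64, 2))

    def forward(self, p):
        return self.net(p)


def build_attack_training_set(shadow, member_x, non_member_x):
    """The shadow model stands in for the label-only target: its probability
    vectors on its own training samples (label 1) and held-out samples
    (label 0) become the attack model's training data."""
    feats, labels = [], []
    with torch.no_grad():
        for x, is_member in ((member_x, 1), (non_member_x, 0)):
            probs = F.softmax(shadow(x), dim=1)
            # Sort each vector so the feature does not depend on class identity.
            feats.append(torch.sort(probs, dim=1, descending=True).values)
            labels.append(torch.full((len(x),), is_member, dtype=torch.long))
    return torch.cat(feats), torch.cat(labels)


def infer_membership(attack, shadow, x):
    """At attack time the target only returns labels, so the shadow model is
    queried instead to estimate the probability vector for sample x."""
    with torch.no_grad():
        probs = F.softmax(shadow(x), dim=1)
        feats = torch.sort(probs, dim=1, descending=True).values
        return attack(feats).argmax(dim=1)              # 1 = predicted member
```

Training the attack model itself is a standard binary classification loop over the features and labels returned above; the sample optimization step, which adjusts the generated data so the shadow model better matches the target's decision boundary, would sit between shadow-model training and attack-model training.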
Keywords/Search Tags:privacy preserving, model extraction attack, membership inference attack, data-free, label