
Research On Data Forgetting In Deep Neural Networks

Posted on: 2024-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Huang
Full Text: PDF
GTID: 2568307067472594
Subject: Computer technology

Abstract/Summary:
Artificial intelligence, with deep learning technology as its core, has developed rapidly in recent years and has been widely applied in numerous fields, including national defense, medical health, and intelligent transportation. The integration of deep learning has propelled society toward informatization and intelligence. However, research has revealed that well-trained models store a significant amount of training-data information, and attackers can potentially steal this information by reverse engineering the model, posing a serious threat to data privacy and security. Machine unlearning can eliminate specific data information from deep models without noticeably impacting model performance, offering an effective solution to data privacy and security problems in the intelligence era.

Although machine unlearning has made progress, several problems persist, including but not limited to the following: existing unlearning algorithms are typically designed for specific scenarios and cannot handle multiple scenarios; and machine unlearning, although originally developed to protect data privacy, is itself vulnerable to attacks that leak the forgotten data. This thesis therefore conducts an in-depth analysis of these problems. Its main contributions and research content are summarized as follows:

(1) This thesis proposes a third-party-data-based unlearning algorithm to address the multi-scenario unlearning problem. The proposed method supports unlearning in various scenarios, such as backdoor unlearning, class unlearning, and subset unlearning. Notably, it requires no additional storage and no modification to the machine learning paradigm when unlearning data. The core idea is to fine-tune the model parameters to minimize the Kullback-Leibler divergence between the prediction vectors of the forgotten data and those of third-party data. Experiments on various standard datasets demonstrate that the proposed method can unlearn data in multiple unlearning scenarios without noticeable performance degradation. In addition, compared to the baseline method, it significantly improves time efficiency.

(2) This thesis proposes a label stealing attack based on model inversion to investigate the data leakage problem in machine unlearning. An in-depth analysis of existing machine unlearning methods in the class unlearning scenario shows that the prediction vectors of forgotten data differ significantly from those of other data. Based on this observation, this thesis proposes feature inversion algorithms, based on gradient optimization and a genetic algorithm, for the black-box and white-box scenarios, aiming to obtain the prediction vector of each class of data. It further proposes a screening approach based on a threshold and entropy to filter the forgotten data out of the obtained prediction vectors. Experimental results on various standard datasets demonstrate that the proposed method effectively attacks various unlearning methods, with a more pronounced attack effect than the baseline method.
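The objective at the heart of contribution (1) — fine-tuning so that the model's predictions on forgotten data match its predictions on third-party data — can be sketched in plain Python. This is a minimal illustration of the KL-divergence loss only, not the thesis's implementation; the direction of the divergence and the use of a near-uniform third-party prediction vector as the target are assumptions for the example.

```python
import math

def softmax(logits):
    """Convert raw model logits to a probability (prediction) vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two prediction vectors (eps avoids log(0))."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def unlearning_loss(forget_logits, third_party_probs):
    """Loss minimized while fine-tuning: the divergence between the model's
    prediction on a forgotten sample and its prediction on third-party data.
    Driving this toward zero makes outputs on forgotten data look like
    outputs on data the model never trained on."""
    return kl_divergence(softmax(forget_logits), third_party_probs)

# A confident prediction on forgotten data yields a large loss...
loss_before = unlearning_loss([8.0, 0.5, 0.5], [0.34, 0.33, 0.33])
# ...while an already-uninformative prediction yields a small one.
loss_after = unlearning_loss([0.1, 0.0, 0.0], [0.34, 0.33, 0.33])
print(loss_before > loss_after)  # True
```

In a real fine-tuning loop this loss would be backpropagated through the model for each forgotten sample; the sketch above only shows the quantity being minimized.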
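The screening step in contribution (2) filters candidate prediction vectors using a threshold and entropy: after class unlearning, the model no longer predicts the forgotten class confidently, so its outputs on that class tend to have low maximum confidence and high entropy. The following is one plausible plain-Python version of such a filter; the threshold value and the entropy-based ranking are illustrative assumptions, not the thesis's exact procedure.

```python
import math

def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of a prediction vector."""
    return -sum(p * math.log(p + eps) for p in probs)

def screen_forgotten(pred_vectors, conf_threshold=0.5):
    """Flag prediction vectors that likely correspond to forgotten data:
    keep those whose top confidence falls below the threshold, then rank
    the survivors by entropy, most uncertain first."""
    flagged = []
    for i, p in enumerate(pred_vectors):
        if max(p) < conf_threshold:          # threshold test
            flagged.append((i, entropy(p)))  # record uncertainty
    flagged.sort(key=lambda t: -t[1])        # entropy test: rank by uncertainty
    return [i for i, _ in flagged]

preds = [
    [0.96, 0.02, 0.02],  # retained class: confident, low entropy
    [0.34, 0.33, 0.33],  # candidate forgotten class: near-uniform
    [0.05, 0.90, 0.05],  # retained class
]
print(screen_forgotten(preds))  # [1]
```

The attack would feed this filter the prediction vectors recovered by feature inversion; only the indices it returns are reported as belonging to the forgotten class.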
Keywords/Search Tags:Deep Learning, Privacy and Security, Machine Unlearning, Right to Be Forgotten, Data Leakage