
Robustness Of Machine Learning Models Based On Adversarial Examples

Posted on: 2024-01-26    Degree: Master    Type: Thesis
Country: China    Candidate: S C Hou    Full Text: PDF
GTID: 2568307067472564    Subject: Computer technology
Abstract/Summary:
With the development of artificial intelligence, machine learning has made significant breakthroughs in many fields, such as face recognition and autonomous driving, and has been widely deployed. However, while machine learning brings convenience, it also exposes a series of security risks. For example, machine learning models represented by neural networks are highly vulnerable to adversarial examples crafted with imperceptible perturbations, owing to their black-box nature and lack of interpretability, which leads to poor model robustness. Numerous methods have been proposed to improve robustness, among which adversarial training is considered one of the most important. The essence of adversarial training is to improve robustness and generalization by introducing adversarial examples and regularizing model parameters. However, existing adversarial training methods fail to effectively distinguish the intrinsic characteristics of adversarial examples in different scenarios, so their improvement of model robustness is limited. To address this problem, this thesis proposes two improved adversarial training methods that improve model robustness while maintaining the model's ability to generalize to natural examples. The specific work is as follows:

(1) An Improved Misclassification Aware adveRsarial Training with an Entropy-Based Uncertainty Measure (E-MART). Misclassification Aware adveRsarial Training (MART) is an effective adversarial training method that incorporates an explicit differentiation of misclassified examples as a regularizer. However, MART uses only the prediction error to identify misclassified examples, i.e., it focuses solely on the output probability with respect to the ground-truth label and neglects the impact of the complement classes, which prevents it from achieving the best performance. This thesis proposes an improved MART method with an entropy-based uncertainty measure. Specifically, we consider the outputs over all classes and develop an entropy-based uncertainty measure (EUM) that provides reliable evidence of the impact of misclassified and correctly classified examples. Moreover, based on EUM, we design a soft decision scheme to optimize the adversarial training loss function, which helps train the model more efficiently toward its final robustness; a sketch of the idea follows. Experiments on the CIFAR-10 dataset show that our method effectively improves model robustness without sacrificing prediction accuracy on natural examples.
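The following is a minimal, illustrative sketch of how an entropy-based per-example weight could plug into a MART-style objective, assuming a PyTorch setup; the names eum_weight, emart_loss, and lambda_reg are hypothetical and do not reproduce the thesis's exact formulation.

import torch
import torch.nn.functional as F

def eum_weight(logits_natural):
    # Normalized entropy of the natural prediction: high entropy = uncertain example.
    probs = F.softmax(logits_natural, dim=1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum(dim=1)
    max_entropy = torch.log(torch.tensor(float(probs.size(1)), device=probs.device))
    return entropy / max_entropy            # per-example weight in [0, 1]

def emart_loss(logits_adv, logits_natural, y, lambda_reg=6.0):
    # Adversarial cross-entropy plus an EUM-weighted KL regularizer (MART-style).
    ce = F.cross_entropy(logits_adv, y)
    kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                  F.softmax(logits_natural, dim=1),
                  reduction="none").sum(dim=1)
    weight = eum_weight(logits_natural)     # soft decision instead of a hard misclassified/correct split
    return ce + lambda_reg * (kl * weight).mean()

Because the normalized entropy lies in [0, 1], it acts as a soft, confidence-based weight on each example's regularization term rather than a hard split between correctly classified and misclassified examples.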
(2) Critical-PGD-Steps Perturbation-Aware Instance-Reweighted Adversarial Training (CPAIR). Reweighting adversarial examples in adversarial training has been shown to be effective in improving model robustness. The main idea is that natural examples closer to (farther from) the decision boundary have lower (higher) robustness, and the corresponding adversarial examples should be assigned larger (smaller) weights during adversarial training. Existing methods usually use LPS as the measure of the "distance from an example to the decision boundary", where LPS refers to the least number of projected gradient descent (PGD) steps needed for an adversarial variant of a natural example to cross the decision boundary. However, LPS cannot effectively distinguish examples that need the same number of PGD steps but have different robustness. To address this problem, this thesis proposes an adversarial training method based on the perturbation awareness of critical PGD steps. First, the maximum number of PGD steps an adversarial example takes before crossing the decision boundary is computed as its critical PGD steps, and the minimum perturbation required to cross the boundary is computed from the critical PGD steps. Second, based on the critical PGD steps and the corresponding minimum perturbation, a critical-steps perturbation-aware distance metric is designed to measure the distance from an example to the decision boundary. Finally, weights are dynamically assigned to the adversarial examples in reweighted adversarial training according to this metric, as sketched below. Experiments show that CPAIR outperforms state-of-the-art algorithms of the same kind.
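A minimal sketch, assuming an L-infinity PGD attack in PyTorch, of how the critical step and the perturbation at the crossing point could be recorded and turned into instance weights; critical_steps_distance, cpair_weights, and the alpha/beta coefficients are hypothetical names and settings, not the thesis's exact definitions.

import torch
import torch.nn.functional as F

def critical_steps_distance(model, x, y, eps=8/255, step_size=2/255, max_steps=10):
    # Run an L-infinity PGD attack and record, per example, the first step at which
    # the prediction flips (critical step) and the perturbation norm at that point.
    x_adv = x.clone().detach()
    steps = torch.full((x.size(0),), float(max_steps), device=x.device)
    pert = torch.zeros(x.size(0), device=x.device)
    crossed = torch.zeros(x.size(0), dtype=torch.bool, device=x.device)
    for k in range(1, max_steps + 1):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + step_size * grad.sign()).detach()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
        with torch.no_grad():
            newly = (~crossed) & (model(x_adv).argmax(dim=1) != y)
            norms = (x_adv - x).flatten(1).norm(dim=1)
        steps[newly] = float(k)                 # critical PGD step
        pert[newly] = norms[newly]              # perturbation size at the crossing point
        crossed |= newly
    pert[~crossed] = norms[~crossed]            # never crossed within the budget: keep final norm
    return steps, pert

def cpair_weights(steps, pert, alpha=1.0, beta=1.0):
    # Examples nearer the boundary (fewer steps, smaller perturbation) get larger weights.
    distance = alpha * steps + beta * pert
    w = torch.exp(-distance)
    return w * (w.numel() / w.sum())            # normalize so the batch-average weight is 1

Under this scheme, examples that cross the boundary in few steps with a small perturbation receive the largest weights, concentrating training effort on the least robust instances while the perturbation term separates examples that share the same step count.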
Keywords/Search Tags:Adversarial Examples, Adversarial Training, Misclassification Aware, Instance-Reweighted Adversarial Training, Critical PGD Steps Perturbation Awareness