In recent years, the rapid development of computer technology, the growing volume of available information, and advances in hardware resources have greatly facilitated research on neural networks, encouraging increasingly complex and practical models. Machine learning models are now used in a wide range of fields, but a pressing problem is how to protect AI models from illegal theft, distribution, and abuse (i.e., intellectual property protection), which is crucial to the industrialization of AI. To protect model copyright, many digital watermarking algorithms have emerged, commonly known as trigger-based black-box watermarking techniques. The model owner overlays a trigger on an original input to create watermark samples; when the trigger appears at test time, the model outputs a predefined target label, allowing the owner to verify ownership. Watermark attack techniques have also been widely studied; common examples include fine-tuning, model pruning, and other watermark removal attacks. An effective watermarking technique should be robust against most such attacks, while a strong attack in turn helps researchers design more robust watermarking techniques, so the two complement and advance each other.

In this paper, we propose two watermark removal schemes based on neural network pruning. We investigate the sensitivity of neurons and channels in a neural network to watermarked samples, and show that pruning these sensitive neurons and channels can remove the watermark, reducing the watermark success rate to a certain extent and thereby evading model ownership verification. The main contributions are:

1. We find that, under adversarial perturbation of neurons, a watermarked model exhibits the same trigger-induced behavior even on inputs without triggers and misclassifies more easily than an unwatermarked model. We therefore propose a simple model repair method, adversarial neuron pruning, to remove the watermark.

2. We study the correlation between channel Lipschitz constants and watermarked samples containing triggers, and find that watermark-related channels have high Lipschitz constants; pruning these sensitive channels removes the watermark. We therefore propose a watermark removal scheme based on channel pruning (an illustrative sketch follows below).

3. We evaluate the above two methods against three different watermark embedding methods and a blind watermarking framework, and experimentally demonstrate that they reduce the watermark success rate and, to some extent, successfully evade model ownership verification.
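To make the channel-pruning idea in contribution 2 concrete, the following is a minimal PyTorch sketch, not the paper's exact procedure: it upper-bounds each convolutional channel's Lipschitz constant by the L2 norm of that channel's flattened weights and zeroes out channels whose bound is an outlier within its layer. The function names (`channel_lipschitz_upper_bounds`, `prune_sensitive_channels`) and the outlier threshold `threshold_std` are illustrative assumptions.

```python
import torch
import torch.nn as nn

def channel_lipschitz_upper_bounds(conv: nn.Conv2d) -> torch.Tensor:
    """Upper-bound each output channel's Lipschitz constant by the L2 norm
    of that channel's flattened weight slice (a cheap, per-channel bound)."""
    w = conv.weight.detach()           # shape: (out_ch, in_ch, kH, kW)
    w = w.view(w.size(0), -1)          # one row per output channel
    return w.norm(dim=1)               # (out_ch,) vector of bounds

def prune_sensitive_channels(model: nn.Module, threshold_std: float = 3.0) -> None:
    """Zero out conv channels whose Lipschitz bound lies more than
    `threshold_std` standard deviations above the layer mean (assumed
    to be the watermark-sensitive channels)."""
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            bounds = channel_lipschitz_upper_bounds(module)
            cutoff = bounds.mean() + threshold_std * bounds.std()
            mask = (bounds <= cutoff).float().view(-1, 1, 1, 1)
            with torch.no_grad():
                module.weight.mul_(mask)
                if module.bias is not None:
                    module.bias.mul_(mask.view(-1))
```

In this sketch the pruning is simulated by masking weights in place; after pruning, one would typically re-evaluate both the clean accuracy and the watermark success rate to check that the watermark is removed without destroying the model's normal behavior.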