| Program learning is the task of developing algorithms that satisfy given specifications or a set of constraints. Pre-trained language models have played a significant role in various natural language processing tasks and have provided feasible solutions for program learning tasks. Program learning based on pre-trained language models is therefore meaningful both for accelerating progress in program learning and for exploring how language models transfer to program learning tasks. It is also of practical significance to identify a concrete application (such as web application firewall testing) for program learning based on pre-trained language models. Current program learning methods mainly use traditional search algorithms or simple deep learning frameworks. These approaches require the prior design of domain-specific languages and training on massive amounts of program learning data. Furthermore, a separate copy of the model is needed for each task. The success of generative pre-trained language models such as GPT-3 demonstrates that such models transfer across different tasks while requiring only a small amount of training data. Therefore, the first primary objective of this paper is to design challenging program learning tasks for existing generative pre-trained language models. Through self-attention attribution, we analyze the data flow in the model, identify specific attention heads, and increase their output weight, improving the model’s average accuracy on program learning tasks by 6.71%. Then, based on the self-attention attribution weights, the model is pruned, achieving optimal model pruning at 87.2% of its original size while maintaining accuracy. Finally, the study concludes that as long as the training tasks are similar, the critical attention heads are highly correlated. The second primary objective of this paper is to investigate the application of program learning methods based on pre-trained language models to web application firewall testing. The purpose of this second objective is twofold: on the one hand, to study how well program learning methods based on pre-trained models work in practice, and on the other hand, to address the issues that traditional machine learning methods encounter in testing web application firewalls and to improve the effectiveness and efficiency of the tests. Web application firewalls (WAFs) are widely deployed to protect key web applications against multiple security threats, so it is important to test WAFs regularly to prevent attackers from bypassing them easily. Machine-learning-based black-box WAF testing is gaining more attention, but existing learning-based approaches have strict requirements on the source and scale of payload data and suffer from the local optimum problem, which limits their effectiveness and practical application. We propose program learning based on pre-trained language models for testing web application firewalls. The approach fine-tunes a Generative Pre-trained Transformer language model with reinforcement learning so that it places minimal restrictions on payload data and is thus more applicable in practice, and it uses reward modeling and a KL-divergence penalty to improve effectiveness and mitigate the local optimum issue. We implement the approach and evaluate it on two well-known open-source WAFs against three kinds of common attacks. Experimental results show that our approach significantly outperforms the state-of-the-art approaches, i.e., ML-Driven and RAT, finding up to 7.8× (3.2× on average) more bypassing payloads within 1,250,000 requests, or finding all bypassing payloads using up to 8.1× (3.3× on average) fewer requests. |