| Secure data transmission and high performance are basic demands for web-based service providers, but cryptographic operations incur a costly overhead on computing capacity and consequently degrade performance. Recently, dedicated hardware accelerators have emerged that offload complex cryptographic operations to Application-Specific Integrated Circuit (ASIC) hardware. However, it is challenging to provide a test suite that quantifies the acceleration benefits of these accelerators for cloud infrastructure providers, because different manufacturers ship accelerators with highly heterogeneous runtime modes. Moreover, to be a fair and valid testbed, a test suite must run its benchmarks while making full use of the accelerators' capacity, which is complicated. To this end, we set up a flexible and fully functional testbed that supports diverse accelerators and running modes for typical cloud applications and cryptographic algorithms. A pressure benchmark, based on Apache Bench, is proposed to measure throughput and latency as the metrics of the performance improvement achieved by the accelerators. The test suite includes a series of micro-level and macro-level workloads that mimic practical cloud applications. Micro-level workloads measure the performance improvement from offloading cryptographic algorithms; 2048-bit RSA, ECDSAP, ECDH, and chained cipher algorithms are selected as classic micro-level workloads. The Nginx server and the Redis and MySQL back-end databases are then chosen as typical macro-level workloads to evaluate many different aspects, including throughput and latency. Among many industrial accelerators, Intel's QAT, Cavium's Nitrox, and Exar's DX2040 are chosen as representatives for evaluation. The test results are elaborated across comprehensive scenarios, the performance enhancements for cloud infrastructure are examined in depth, and product-to-product comparisons are discussed in detail. We found that an asynchronous working mode is necessary for significantly accelerating cryptographic operations. Furthermore, heavy cryptographic algorithms are extremely CPU-intensive and are suitable for offloading onto dedicated accelerators. Less complicated algorithms are lightweight enough that the offloading overhead can become the dominant factor in their performance, so they are inappropriate to offload onto dedicated accelerators. In addition, using accelerators can save substantial CPU resources while maintaining the same throughput. |