| As big data, 5G, and artificial intelligence become part of everyday life, the widespread application of these emerging technologies keeps increasing the amount of information that people need to exchange. Encrypting data before transmission with suitable encryption algorithms has therefore become indispensable. To keep data transmission secure, further improving the throughput of encryption and decryption is an active and difficult research problem at this stage. Because of its modular instruction subsets and its extensibility, the RISC-V instruction set architecture has become a popular basis for accelerating encryption algorithms through extended instructions. Based on SM4, the first commercial cipher algorithm independently designed in China, this thesis designs an SM4 coprocessor for the open-source RISC-V processor Hummingbird E203 and extends the RISC-V instruction set architecture with three custom instructions that drive the coprocessor. The main work accomplished in this thesis is as follows: (1) An SM4 coprocessor for the Hummingbird E203 processor is designed and implemented in Verilog HDL. Three RISC-V custom extension instructions transfer the control signals required by the SM4 algorithm into the dedicated registers of the SM4 coprocessor, greatly improving the efficiency of SM4 encryption and decryption. (2) The three custom instructions are wrapped as inline assembly functions and called from C code. (3) The C code is compiled on the Nuclei Studio IDE software platform to generate the final binary machine code. The compilation results show that encrypting one set of data with the SM4 coprocessor requires 267 instructions. Functional simulation of the SM4 coprocessor in Vivado 2019.2 shows that, at a 300 MHz clock frequency, an encryption or decryption operation on one 128-bit data block takes 41 cycles, giving a throughput of 936.59 Mbit/s. Compared with software execution, the number of clock cycles for ECB encryption, ECB decryption, CBC encryption, and CBC decryption is reduced by 98.10%, 98.19%, 97.99%, and 98.05%, respectively. At a 300 MHz clock, the throughput of ECB encryption, ECB decryption, CBC encryption, and CBC decryption increases by 51.70 times, 54.16 times, 48.98 times, and 50.18 times, respectively, compared with software execution. |
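
The throughput figure quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not code from the thesis: the function names `throughput_mbit_per_s` and `speedup_from_cycle_reduction` are hypothetical, and the sketch assumes that one 128-bit block completes every 41 cycles with no overlap between blocks.

```python
def throughput_mbit_per_s(block_bits: int, clock_hz: float, cycles_per_block: int) -> float:
    """Throughput in Mbit/s, assuming one block finishes every `cycles_per_block` cycles."""
    blocks_per_second = clock_hz / cycles_per_block
    return block_bits * blocks_per_second / 1e6


def speedup_from_cycle_reduction(reduction_percent: float) -> float:
    """Ratio old_cycles / new_cycles implied by a percentage reduction in cycle count."""
    remaining_fraction = 1.0 - reduction_percent / 100.0
    return 1.0 / remaining_fraction


if __name__ == "__main__":
    # Figures taken from the abstract: 128-bit block, 300 MHz clock, 41 cycles.
    print(f"{throughput_mbit_per_s(128, 300e6, 41):.2f} Mbit/s")

    # Cycle-count reductions reported in the abstract, per mode.
    reductions = {
        "ECB encryption": 98.10,
        "ECB decryption": 98.19,
        "CBC encryption": 97.99,
        "CBC decryption": 98.05,
    }
    for mode, percent in reductions.items():
        print(f"{mode}: about {speedup_from_cycle_reduction(percent):.1f}x fewer cycles")
```

With the abstract's inputs, the first function gives 936.59 Mbit/s. The second shows that a cycle reduction of about 98% corresponds to roughly a fifty-fold ratio, the same order as the reported throughput gains. The percentages are rounded, so the sketch is not expected to reproduce those gains exactly.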