With the advent of the information era, more and more high-definition and even ultra-high-definition video content is being produced. In January 2013, JCT-VC released the latest video compression standard, HEVC. This standard adopts a quad-tree-based coding structure, larger coding blocks, and more prediction directions, roughly doubling the compression efficiency of the existing H.264 standard.

The HEVC standard adopts Context-based Adaptive Binary Arithmetic Coding (CABAC) as its entropy coding algorithm, and CABAC plays a very important role in the standard. It encodes residual information using precise context models, which brings the coded size close to the Shannon entropy limit. However, CABAC encodes sequentially, bit by bit, which makes it the main bottleneck limiting the data throughput of HEVC video compression.

Xilinx Inc. has recently developed a new High-Level Synthesis (HLS) tool. It lets engineers skip Register Transfer Level (RTL) design and the low-level structural details of the Field Programmable Gate Array (FPGA). The verification procedure is also relatively simple, so HLS can greatly shorten the hardware system development cycle.

To address the throughput bottleneck of CABAC, this paper builds on a careful study of the HEVC standard and the CABAC algorithm and analyses the bottleneck at the algorithm level. We then use the HLS tool to design and implement the core structure of a CABAC encoder. The main work of this paper includes:

1. A multilevel pipelined CABAC hardware architecture is presented. We divide CABAC into three sub-modules: bitstream encoding, packaging, and stream output. The three modules are then pipelined, using a branch prediction method to remove the data dependency between the bitstream encoding module and the stream output module.

2. An implementation of the CABAC hardware structure that supports real-time 4K video compression is completed with the HLS tool.
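The bit-by-bit dependency described above can be illustrated with a small software sketch. It is not taken from this paper's design: it models only the bypass-mode update of the arithmetic coder's lower bound, leaves out renormalization and byte output, and all names (`BypassState`, `encode_bypass_bin`, `encode_bypass_bins`) are hypothetical. The serial form shows why each bin must wait for the previous one; the batched form shows that, for bypass bins specifically, the recurrence can be unrolled into a single update.

```cpp
#include <cstdint>

// Simplified bypass-mode coder state (illustrative, not the paper's design).
// In bypass mode 'range' is left unchanged; only 'low' changes per bin.
struct BypassState {
    std::uint64_t low;    // kept wide here so that no bits are shifted out
    std::uint32_t range;  // interval width, constant across bypass bins
};

// Serial form: each call needs the 'low' produced by the previous call.
inline void encode_bypass_bin(BypassState& s, unsigned bin) {
    s.low <<= 1;
    if (bin & 1u) {
        s.low += s.range;
    }
}

// Batched form: 'bins' packs n bins, first bin in the most significant position.
// Unrolling the recurrence above gives low * 2^n + range * bins.
inline void encode_bypass_bins(BypassState& s, std::uint32_t bins, unsigned n) {
    s.low = (s.low << n) + static_cast<std::uint64_t>(s.range) * bins;
}
```

Regular-mode bins do not collapse this way, because the range and the context state also change with every bin, which is why the regular path is the harder part of the throughput problem.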
For the proposed CABAC hardware structure, we implemented the regular and bypass coding modes of CABAC and optimized them with the HLS tool. The partition directive maps an array onto registers to speed up data access. The unroll directive turns a serially executed loop body into concurrent hardware, shortening the processing latency. The pipeline directive establishes a multi-stage pipelined hardware structure. Finally, we carried out RTL functional simulation of the CABAC encoder with the HLS tool.

In this paper, a multilevel pipelined CABAC hardware architecture for HEVC is proposed and implemented with the HLS tool. Experimental results show that the architecture achieves a high data throughput that meets the requirements of real-time 4K video compression. The proposed core architecture of the CABAC engine has low algorithmic complexity, can be implemented on a single FPGA, and has high practical value.
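As a rough illustration of how the three optimizations mentioned above are written in HLS C++, the sketch below applies the Vivado HLS pragmas `ARRAY_PARTITION`, `UNROLL`, and `PIPELINE` to a small loop that folds a group of bypass bins into the coder's lower bound. The function name `encode_bypass_group`, the group size `kGroup`, and the choice of pragma placement are assumptions made for this example and do not reproduce the paper's actual source; renormalization and byte output are omitted. A standard C++ compiler ignores the pragmas, so the same function can also be run as a software model.

```cpp
#include <cstdint>

// Hypothetical group size: number of bypass bins handled per call.
const int kGroup = 4;

// Folds kGroup bypass bins into 'low' (simplified: no renormalization/output).
std::uint64_t encode_bypass_group(std::uint64_t low, std::uint32_t range,
                                  const std::uint8_t bins[kGroup]) {
// Split the array into individual registers so all bins are readable at once.
#pragma HLS ARRAY_PARTITION variable=bins complete dim=1
// Ask the tool to accept a new group of bins every clock cycle.
#pragma HLS PIPELINE II=1
    for (int i = 0; i < kGroup; ++i) {
// Replicate the loop body in hardware instead of iterating over it serially.
#pragma HLS UNROLL
        low = (low << 1) + ((bins[i] & 1) ? range : 0u);
    }
    return low;
}
```

Whether the tool actually reaches the requested initiation interval depends on the target device and clock constraint, so the synthesis report should be checked rather than assumed.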