Research On Front-end Design Of VVC Intra Encoder Chip For 8K30

Posted on: 2023-12-16 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Q Ye | Full Text: PDF
GTID: 2558307097978559 | Subject: Electronic Science and Technology
Abstract/Summary:
In recent years, to meet the demand for high-quality video, video resolution has moved from 1080p to 4K and is gradually advancing to 8K. An 8K30 ultra-high-definition video has a resolution of 8192×4320 pixels, and the amount of data transmitted at 30 frames per second is enormous, so it is important to study coding standards that compress the data more effectively. VVC, the new coding standard released in 2020, can save about 50% of the bit rate at the same picture quality compared with its predecessor HEVC. VVC is therefore an important solution to the problem of 8K30 ultra-high-definition video transmission, and the intra encoder is one of its key technologies. However, the computation of the intra encoder is very complex and takes up a large share of the hardware resources of the whole encoder, and the hardware cost of a VVC implementation is much higher than that of an HEVC implementation. There are two main reasons for the increase in hardware cost. First, VVC adopts the new multi-type tree (MTT) partitioning technique, which adds rectangular prediction blocks and transform blocks to the previous square blocks. Second, the number of angular intra prediction modes has increased from 33 in HEVC to 65 in VVC. To improve the processing speed, the parallelism must be increased, and the hardware cost doubles. Therefore, this thesis designs an intra encoder that targets 8K30 performance and studies the intra encoder from two aspects: application flexibility and hardware cost reduction. The main innovations are as follows:

(1) To improve the throughput of the intra encoder and reduce the data dependence between different CU blocks, a fully parallel intra encoder architecture is designed. This parallel architecture introduces seven computing engines that support different CU sizes, including four square CU sizes (4×4, 8×8, 16×16, 32×32) and six rectangular CU sizes (16×8, 8×16, 32×8, 32×16, 16×32, 8×32). Designing each computing engine separately would waste time and effort in later verification and maintenance. Therefore, this thesis designs a flexible hardware structure for the data reconstruction part of the intra encoder that achieves different functions by changing the values of configuration parameters. This reduces the seven separately designed modules to one and greatly saves development cost.

(2) Current mainstream DCT designs generally use input signals to select the type of computation, so the computing resources cannot be tailored to each CU size. This thesis proposes a structure that automatically optimizes hardware cost: each time the same module is used with different configuration parameters, only the hardware circuits needed to implement the configured function are included, with no redundant circuits. At constant throughput, the hardware cost of the same 2D DCT circuit configured to support only 4×4 CU blocks is only about 3% of that of the circuit configured to support all sizes.

(3) In the recently published literature, most DCT calculations use a butterfly structure, so the throughput is usually limited to a multiple of 4, and the throughput for small sizes is often lower than that for large sizes. This not only complicates the processing of subsequent modules but also reduces the processing rate of the whole structure. In this thesis, a circuit with a throughput of 1 pixel/cycle is abstracted into a basic unit, so the throughput can be designed to any value as required. This makes the circuit throughput configurable: the user can set it to 8 pixels/cycle or 16 pixels/cycle depending on the application. This provides flexibility to meet later performance requirements and minimizes the cost of later code changes.

(4) Compared with other works that support the transform of only several square CU sizes, this thesis adds rectangular blocks to the square blocks and can also support a single size or several sizes by changing the values of configuration parameters, which greatly improves application flexibility. Moreover, a method of storing multiple data items at a single address is proposed for the transpose memory module, which reduces the number of SRAMs by 50% compared with the mainstream approach of storing one data item per address.

Synthesis results show that the proposed 2D DCT/IDCT structure, which supports the largest number of CU sizes, can operate at a maximum frequency of 465 MHz, and its normalized area is up to 70% smaller than that of designs in the existing literature. With the proposed quantization structure, the normalized area is reduced by 54.6%. Performance analysis shows that the VLSI architecture designed in this thesis can meet the real-time coding requirements of 8K30 UHD video.
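
The separable 2D transform with a transposed intermediate result, applied to both square and rectangular blocks, can be illustrated with a small software model. This is only a sketch under stated assumptions, not the thesis's circuit: it uses a floating-point orthonormal DCT-II as a stand-in for the integer transforms a hardware encoder would use, it assumes a W×H size label means width × height, and the function names (`dct_matrix`, `dct_1d_rows`, `dct_2d`, `idct_2d`) are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal floating-point DCT-II matrix (n x n).

    Assumption: a stand-in for the integer transform matrices that a
    hardware encoder would actually use.
    """
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def transpose(m):
    """Software counterpart of the transpose memory between the two passes."""
    return [list(col) for col in zip(*m)]


def dct_1d_rows(block, mat):
    """Apply the 1D transform `mat` to every row of `block`."""
    return [[sum(c * x for c, x in zip(coef, row)) for coef in mat]
            for row in block]


def dct_2d(block):
    """Row pass, transpose, column pass, transpose back.

    The block size acts as the configuration parameter: the same code
    handles square and rectangular blocks.
    """
    height, width = len(block), len(block[0])
    stage1 = dct_1d_rows(block, dct_matrix(width))                # along rows
    stage2 = dct_1d_rows(transpose(stage1), dct_matrix(height))   # along columns
    return transpose(stage2)


def idct_2d(coeffs):
    """Inverse transform; the inverse of an orthonormal matrix is its transpose."""
    height, width = len(coeffs), len(coeffs[0])
    stage1 = dct_1d_rows(coeffs, transpose(dct_matrix(width)))
    stage2 = dct_1d_rows(transpose(stage1), transpose(dct_matrix(height)))
    return transpose(stage2)


if __name__ == "__main__":
    # A rectangular block, 16 samples wide and 8 samples high.
    block = [[float((r * 16 + c) % 7) for c in range(16)] for r in range(8)]
    restored = idct_2d(dct_2d(block))
    error = max(abs(restored[r][c] - block[r][c])
                for r in range(8) for c in range(16))
    print("max reconstruction error:", error)
```

In this model the only thing that changes between CU sizes is the pair of matrix dimensions, which mirrors the idea of one parameterized module replacing several size-specific ones. Fixed-point arithmetic, throughput, and memory organization are outside its scope.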
Keywords/Search Tags: VVC, 8K30, intra encoder, data reconstruction, discrete cosine transform