| With the rapid development of network communication and digital media, digital video has become a mainstream information medium in today’s society. To support the adoption of ultra-high-definition (UHD) video, several video standardization organizations have proposed a new generation of video coding standards, which greatly improve the compression rate of video coding and effectively relieve the bandwidth problem in UHD video transmission. However, during motion compensation in inter-frame decoding, the new coding standards must read a large amount of reference data to reconstruct the video image, which increases both decoding time and bandwidth consumption. Reducing the bandwidth occupied by inter-frame decoding data and increasing the decoding speed are therefore a research focus of hardware decoder IP design. To address the problems of existing motion compensation reading architectures, this paper designs a new scheme for reading motion compensation reference frame data, based on the architecture of a hardware video decoder IP integrated in an SoC. First, frame buffer compression is applied to reference frame storage: the reference frame is losslessly compressed by differential coding and then written to memory, which reduces the read and write bandwidth of the reference frame. Second, a multi-level cache is added to the motion compensation reading module. Based on the data structure of frame buffer compression, and weighing cache hit rate against chip area, caches and cache lines of different capacities are designed for header data and original pixel data, so as to optimize the cache refresh strategy, reduce the bandwidth consumed by repeated requests, and improve video decoding efficiency. The hardware architecture is designed according to the proposed motion compensation reading scheme. After the architecture is divided into submodules, the circuit is implemented in a hardware description language. Simulation is used to analyze blind spots in the cache refresh strategy and the characteristics of video coding, and the cache hit rate is further improved through a two-stage optimization. The Amve verification platform is used to verify the correctness of the design. Finally, the chip is taped out on the TSMC 12 nm process, and its performance is tested after fabrication. The test results show that, compared with the traditional architecture, the inter-frame decoding read bandwidth is reduced by 30% on average and the decoding speed is increased by 10% on average; both metrics are better than those of the traditional architecture. At a single-core clock frequency of 800 MHz, the decoding speed reaches 4K 120 fps / 8K 30 fps, which meets the design expectations. |
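
The lossless differential coding step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the predictor, block size, or bit packing used in the actual design, so the left-neighbor predictor and the names `delta_encode` and `delta_decode` below are assumptions chosen only for illustration; they show why differential coding is lossless, not how the hardware implements it.

```python
from typing import List


def delta_encode(pixels: List[int]) -> List[int]:
    """Keep the first pixel, then store each pixel as a difference
    from its left neighbor (assumed predictor, for illustration)."""
    if not pixels:
        return []
    residuals = [pixels[0]]
    for prev, cur in zip(pixels, pixels[1:]):
        residuals.append(cur - prev)
    return residuals


def delta_decode(residuals: List[int]) -> List[int]:
    """Rebuild the pixels by accumulating the residuals."""
    pixels = []
    acc = 0
    for r in residuals:
        acc += r
        pixels.append(acc)
    return pixels


# Hypothetical row of 8-bit luma samples from a reference frame.
row = [100, 102, 101, 101, 105, 110, 110, 108]
residuals = delta_encode(row)
restored = delta_decode(residuals)

print(residuals)        # [100, 2, -1, 0, 4, 5, 0, -2]
print(restored == row)  # True: the round trip is lossless
```

In smooth image regions the residuals after the first sample are small in magnitude, so they can typically be stored in fewer bits than the raw samples; any actual bandwidth saving depends on the packing scheme, which this sketch does not model.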