CUDA-based Inter-frame Prediction Optimization And Parallelization

Posted on: 2015-10-15    Degree: Master    Type: Thesis
Country: China    Candidate: P C Wang    Full Text: PDF
GTID: 2308330452455848    Subject: Computer application technology
Abstract/Summary:
As a main module of H.264/AVC, inter-frame prediction is used chiefly to remove temporal redundancy in a video sequence and thereby improve the compression rate. The module is time-consuming, and its high resource utilization makes it the bottleneck for performance improvement. Meanwhile, with growing GPU computing capacity and the maturing CUDA (Compute Unified Device Architecture) platform, more and more compute-intensive applications have been migrated to the GPU. Because inter-frame prediction is the bottleneck of H.264, how to accelerate it with CUDA has become a hot topic in video compression and high-performance computing.

Traditional motion estimation algorithms have high data dependency, which makes them difficult to adapt to the CUDA SIMT computing model. On the other hand, experiments on different kinds of video data show that the coding data of different layers is highly correlated. Based on these observations, we optimized the module in four respects: (1) we reorganized the workflow of the inter-frame prediction module to make it better suited to CUDA; (2) we propose a motion-tendency-oriented motion estimation algorithm to make full use of CUDA computing resources, which fast search algorithms leave underused because of their strong data dependency; (3) we propose and implement a preliminary search mechanism based on domain partition and sampled matching, which reduces the computing load of each thread and makes full use of neighborhood information; (4) we propose and implement a model merging algorithm based on inter-layer prediction, which exploits the inter-scale dependency of motion vectors.

The experimental results show that, compared with the full search algorithm, the adaptive iterative search algorithm achieves a 70 to 80 times speedup while keeping the SNR loss of coded frames under 0.5 dB. Compared with the mainstream full search algorithm, the proposed algorithm is more than 3 times faster and also achieves a better coding effect. Compared with a classic CUDA-based motion estimation algorithm, the gain is about 20%, again with a coded-frame SNR loss under 0.5 dB.
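
For readers unfamiliar with the full search baseline mentioned in the abstract, the following is a minimal CPU sketch of block matching by sum of absolute differences (SAD). It is not the thesis's algorithm or its CUDA implementation. The function names (`sad`, `full_search`, `demo`), the 8x8 block size, the search radius, and the synthetic random frame are all assumptions chosen for illustration. It shows that each candidate displacement is scored independently of the others, which is the property that lets full search map naturally onto parallel threads.

```python
import random


def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, radius):
    """Return (best_sad, (dx, dy)) over every displacement within `radius`
    that keeps the reference block inside the frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block would fall outside the frame
            # Each candidate is scored without reference to any other
            # candidate, so these iterations could run in parallel.
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


def demo():
    """Build a synthetic reference frame, derive a current frame whose content
    is the reference displaced by a known vector, and recover that vector."""
    rng = random.Random(42)
    width = height = 32
    ref = [[rng.randrange(256) for _ in range(width)] for _ in range(height)]
    true_dx, true_dy = -3, 2  # hypothetical motion vector for the test
    cur = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx, sy = x + true_dx, y + true_dy
            if 0 <= sx < width and 0 <= sy < height:
                cur[y][x] = ref[sy][sx]
    return full_search(cur, ref, bx=12, by=12, n=8, radius=4)


if __name__ == "__main__":
    print(demo())
```

Under these assumptions, `demo()` recovers the displacement `(-3, 2)` with a SAD of 0. A fast search algorithm, by contrast, picks its next candidate from the result of the previous one, which is the data dependency the abstract identifies as an obstacle to the SIMT model.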
Keywords/Search Tags: Video Compression, CUDA, Inter-frame prediction, Adaptive motion estimation