| As a video post-processing technology, Frame Rate Up-Conversion (FRUC) converts a low-frame-rate video into a high-frame-rate video by reconstructing new intermediate frames and inserting them between the original frames. As 2K high-definition video gains popularity and 4K ultra-high-definition video enters everyday use, image resolution and video size have increased greatly, and accelerating FRUC has become a hot topic in video processing in recent years. Most current frame rate conversion methods are motion-compensated and consist of two parts: motion estimation and motion compensation. 3-D Recursive Search block matching (3DRS) is widely used for motion estimation because of its advantage in motion vector consistency. With the rise of heterogeneous computing and the emergence of many-core parallel computing devices, writing high-performance programs that suit the majority of computing platforms has become more complex. The new generations of GPUs from NVIDIA and AMD show that the computing power of parallel computing devices has improved greatly. Since 2008, the OpenCL standard has become popular because of its good cross-platform support. In this thesis, we propose a video Frame Rate Up-Conversion implementation based on OpenCL and analyze its parallel performance on each platform, so that the program meets portability and real-time requirements. The main work of this thesis is summarized as follows. This thesis analyzes the video Frame Rate Up-Conversion algorithm. Because the time complexity of 3DRS is high, the FRUC algorithm cannot meet real-time requirements. We propose a macroblock-based parallel 3DRS algorithm (P-3DRS), which exploits the absence of dependences between the macroblock computations of each row.
However, running the P-3DRS algorithm shows that its efficiency is not high, so we propose two strategies to optimize it. First, by computing the candidate motion vectors of each macroblock in parallel to increase the program's concurrency, we propose the P-3DRS1 algorithm, which effectively accelerates the macroblock-based parallel 3DRS algorithm. Second, because the macroblock similarity computations are free of dependences, we propose a coarse-grained parallel 3DRS algorithm (P-3DRS2). Finally, we propose a parallel motion compensation algorithm (P-MC). We run these algorithms on an NVIDIA GeForce GTX 970, an AMD R9 285, and an NVIDIA Tesla K20; the experimental results verify that the algorithms are feasible, and we analyze their speedups. Because the OpenCL framework is portable across devices, we also run the P-3DRS1 and P-MC algorithms proposed in this thesis on different parallel computing devices, validating them on a multicore CPU, a GPU, and an APU. The results show that the running times of the P-3DRS1 and P-MC algorithms vary greatly with the computing power of the device. A comparison with the single-precision floating-point performance of the devices shows that the running times of both algorithms decrease correspondingly as computing capability increases. |
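
The candidate-level parallelism behind P-3DRS1 can be illustrated with a small sequential sketch. It is not the thesis implementation: the sum of absolute differences (SAD) as the matching cost, the function names `sad` and `best_candidate`, the block size, and the candidate list are all assumptions made for illustration. The point it shows is that the cost of each candidate motion vector depends only on that candidate, so each cost could be computed by a separate work-item on a parallel device.

```python
def sad(prev, curr, bx, by, mv, bs):
    """Sum of absolute differences between the bs x bs block at (bx, by)
    in `curr` and the block displaced by mv = (dx, dy) in `prev`.
    SAD is an assumed matching cost, chosen for illustration."""
    dx, dy = mv
    h, w = len(prev), len(prev[0])
    total = 0
    for y in range(by, by + bs):
        for x in range(bx, bx + bs):
            # Clamp displaced coordinates to the frame borders.
            py = min(max(y + dy, 0), h - 1)
            px = min(max(x + dx, 0), w - 1)
            total += abs(curr[y][x] - prev[py][px])
    return total


def best_candidate(prev, curr, bx, by, candidates, bs=4):
    """Evaluate every candidate motion vector for one macroblock and
    return (best_vector, best_cost).

    Each cost depends only on its own candidate, so this list could be
    filled in parallel (one work-item per candidate); here it is
    computed sequentially."""
    costs = [sad(prev, curr, bx, by, mv, bs) for mv in candidates]
    best = min(range(len(candidates)), key=costs.__getitem__)
    return candidates[best], costs[best]


if __name__ == "__main__":
    # Synthetic 8x8 frames: `curr` is `prev` shifted right by one pixel.
    prev = [[x * 7 + y * 13 for x in range(8)] for y in range(8)]
    curr = [[prev[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]
    # Hypothetical candidate set; a real 3DRS search draws candidates
    # from spatial and temporal neighbours of the current macroblock.
    candidates = [(0, 0), (-1, 0), (1, 0), (0, -1)]
    print(best_candidate(prev, curr, 2, 2, candidates))
```

On a GPU, the final selection of the minimum cost would typically be a reduction over the per-candidate results, while the per-pixel differences inside `sad` offer a further level of parallelism.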