Research On Video Reconstruction Based On Deep Convolutional Neural Network And Spatial-temporal Feature

Posted on: 2018-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: L H Li
GTID: 2348330518495568
Subject: Computer Science and Technology
Abstract/Summary:
Signal interference, contamination, and loss in the acquisition environment often leave video with too little usable information for fields such as spacecraft docking and medical analysis. Video reconstruction, a fundamental problem in computer vision, addresses this and is therefore of both theoretical and practical significance. Conventional image reconstruction algorithms suffer from high computational complexity, poor real-time performance, and weak generalization ability, and they do not perform well under complex motion and severe interference. Enhancing video resolution and clarity by making full use of prior information and effectively mining external datasets is therefore an urgent problem. After studying traditional denoising and super-resolution algorithms, we improve video reconstruction by combining convolutional neural networks with the spatio-temporal features of video. The main work of this thesis is as follows:

(1) A video denoising algorithm based on a residual neural network (ResDN) is proposed to address the poor generalization of traditional algorithms to complex noise. A low-rank matrix is constructed from spatial-temporal similarity. Decomposing this matrix yields a sparse dictionary, which is used to filter coarse-grained noise. Because both noise and fine texture have small sparse coding coefficients, both end up in the filtered residual. A convolutional neural network is then employed to extract and enhance the texture that was filtered out. This preserves more useful information, recovers video with higher quality and clarity, and increases the value of the target video for analysis and use. Experimental results demonstrate that the proposed algorithm achieves better denoising performance on both objective evaluation indexes and subjective visual quality.
Compared with BRFOE, SSC_GSM and NCSR, ResDN improves PSNR by 24%, 2.0% and 9.8%, SSIM by 65.1%, 3.5% and 24.2%, and FSIM by 14.5%, 2.6% and 7.4%, respectively. In terms of RMSE, ResDN achieves reductions of 41.9%, 42.4% and 39.9% relative to the three comparison algorithms.

(2) To address the poor real-time performance of conventional sparse-representation-based algorithms and to make full use of the prior information in the video itself, a video super-resolution reconstruction algorithm (STCNN-SR) is proposed. A correlation mapping model is built with a convolutional neural network, which has excellent image representation ability. The sparse coding and decoding process is replaced by forward propagation through the network, allowing video detail to be reconstructed rapidly and effectively. In addition, moment features and structural features are used to quantify nonlocal similarity in the video, so that highly reliable prior information can be blended into the mapped result. This corrects mismatched and inconsistent patches and improves reconstruction performance. Experiments demonstrate that the proposed algorithm achieves better reconstruction performance on both objective evaluation indexes and subjective visual quality. Compared with ScSR, NL-SR, CNN-SR, DPSR, Video Upsampling and ZM-SR, STCNN-SR improves WSNR by 9.7%, 6.4%, 8.9%, 8.5%, 9.0% and 6.2%, MS-SSIM by 4.3%, 2.3%, 4.3%, 4.2%, 3.1% and 1.6%, and FSIM by 4%, 2.3%, 4.6%, 4.3%, 3.7% and 1.8%, respectively. In terms of RMSE, STCNN-SR achieves reductions of 17.7%, 17.0%, 19.1%, 17.0%, 16.2% and 9% relative to the six comparison algorithms.

(3) A video super-resolution reconstruction algorithm based on patch similarity and a convolutional neural network (STCP-SR) is proposed to address noise amplification in network-based reconstruction algorithms. In addition to extracting feature descriptors from the video, Gaussian distribution features are extracted to associate each reconstructed image patch with the external patch sets.
Beyond making full use of the prior knowledge in the video, the corresponding external information is used to compensate for details and suppress noise, which improves the robustness of the whole reconstruction algorithm. A structural sparse dictionary is built from patches that follow the same Gaussian distribution. Through the coding and decoding process, noise is suppressed and some missing information is rebuilt. Experiments show that the proposed algorithm achieves good reconstruction performance in both objective evaluation and subjective visual quality. Compared with ScSR, DPSR, CNN-SR, CSCN, LASSC, NL-SR and VideoSR, STCP-SR improves PSNR by 7.2%, 6.8%, 7.7%, 7.4%, 5.6%, 4.8% and 6.1%, SSIM by 5.6%, 5.5%, 5.7%, 5.9%, 4%, 3.1% and 3.3%, and FSIM by 4%, 3.4%, 4.1%, 4.2%, 3.7%, 2.8% and 3.1%, respectively. In terms of RMSE, STCP-SR achieves reductions of 18%, 17.3%, 19.6%, 18%, 16.5%, 16.5% and 16.5% relative to the seven comparison algorithms.

(4) A video reconstruction system based on a deep convolutional neural network and spatial-temporal features has been designed and implemented. It consists of three main functional modules: video preprocessing, video denoising, and video super-resolution. The system also provides an index evaluation interface for analysing and evaluating the proposed algorithms. The evaluation indexes fall into two categories: video quality evaluation and video reference evaluation. Experiments demonstrate that the STNN-RS system works well, achieves better video denoising and super-resolution reconstruction, and enhances the quality of the input video, which strongly validates the algorithms proposed in this thesis.
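As an illustration only, the following minimal Python sketch shows two ideas the abstract relies on: suppressing noise by low-rank approximation of a matrix of similar patches, and reporting results as relative changes in RMSE and PSNR. It is not the thesis's ResDN implementation. The truncated SVD stands in for the low-rank and sparse-dictionary step, the patch matrix is synthetic, and every function name, the rank, the noise level, and the peak value of 255 are assumptions made for this example.

```python
import numpy as np


def rmse(ref, est):
    """Root-mean-square error between a reference and an estimate."""
    ref = np.asarray(ref, dtype=np.float64)
    est = np.asarray(est, dtype=np.float64)
    return float(np.sqrt(np.mean((ref - est) ** 2)))


def psnr(ref, est, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs match."""
    err = rmse(ref, est)
    if err == 0.0:
        return float("inf")
    return float(20.0 * np.log10(peak / err))


def relative_change(baseline, proposed):
    """Signed percentage change of `proposed` relative to `baseline`."""
    return 100.0 * (proposed - baseline) / baseline


def low_rank_denoise(patch_matrix, rank):
    """Keep only the `rank` largest singular components of a patch matrix.

    Each column is assumed to hold one vectorised patch. Grouping similar
    patches makes the clean matrix approximately low-rank, so the discarded
    small singular values are treated as noise (hypothetical stand-in for
    the thesis's low-rank step).
    """
    u, s, vt = np.linalg.svd(patch_matrix, full_matrices=False)
    s[rank:] = 0.0
    return (u * s) @ vt


def demo(seed=0):
    """Denoise a synthetic group of similar patches and report the metrics."""
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for 40 similar 8x8 patches: a rank-2, 64 x 40 matrix
    # with values in [0, 255].
    basis = rng.uniform(0.0, 1.0, size=(64, 2))
    weights = rng.uniform(0.0, 1.0, size=(2, 40))
    clean = 255.0 * (basis @ weights) / 2.0
    # Additive Gaussian noise with an assumed standard deviation of 20.
    noisy = clean + rng.normal(0.0, 20.0, size=clean.shape)
    denoised = low_rank_denoise(noisy, rank=2)
    return {
        "rmse_noisy": rmse(clean, noisy),
        "rmse_denoised": rmse(clean, denoised),
        "psnr_noisy": psnr(clean, noisy),
        "psnr_denoised": psnr(clean, denoised),
    }


if __name__ == "__main__":
    r = demo()
    print("RMSE change (%):", relative_change(r["rmse_noisy"], r["rmse_denoised"]))
    print("PSNR change (%):", relative_change(r["psnr_noisy"], r["psnr_denoised"]))
```

A negative RMSE change and a positive PSNR change both indicate improvement, which matches how the abstract reports RMSE as a decrease and PSNR, SSIM and FSIM as increases.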
Keywords/Search Tags: video reconstruction, deep convolutional neural network, spatial-temporal feature, sparse dictionary