Research on Methods for Multi-Source Image and Video Perceptual Quality Assessment

Posted on: 2022-04-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Zhou
Full Text: PDF
GTID: 1488306323462864
Subject: Information and Communication Engineering
Abstract/Summary:
Images and videos are significant information sources for human visual perception and machine pattern recognition, and their perceptual quality plays a decisive role in the accuracy and adequacy of the information obtained. In daily life, humans process a large amount of visual information. Before it reaches the viewer, multimedia data usually pass through an entire processing chain, including acquisition, compression, transmission, reconstruction, and display, among other stages. Along this chain, the original multimedia data undergo various perceptual quality degradations at different processing stages. Thus, how to effectively predict and optimize the user quality of experience for processed images and videos is an open and important research topic.

At the same time, with the continuous development of display technology and equipment, stereoscopic images and videos, panoramas, super-resolution content, and other multimedia formats have become increasingly popular. For example, emerging 3D-TV and head-mounted display devices can provide viewers with a visual experience different from that of flat 2D content. However, many quality issues remain when viewing images and videos in these formats. Therefore, this dissertation studies methods for multi-source image and video perceptual quality assessment. The main contents of the research work are as follows:

(1) Design 2D image and video quality assessment methods based on deep learning. Since perception in the human visual system is based on multi-scale information, this dissertation applies pyramid feature learning to distorted images, as well as to residual images of the luminance channel, to build an end-to-end deep neural network that learns multi-scale features to predict the quality of distorted 2D images. The proposed model achieves better performance than existing image quality assessment methods on four public image subjective quality assessment databases. In addition, compared with image quality assessment, video quality assessment needs to consider not only spatial distortions but also complex temporal characteristics. A pre-trained deep learning model is used to extract local and global spatial-temporal features from video frames and adjacent frame difference maps, and these features are then aggregated. Experimental results show that this method can effectively predict perceptual video quality.

(2) Establish a subjective quality assessment database based on the latest stereoscopic video coding standard. Compared with traditional 2D content quality assessment, stereoscopic content quality assessment should consider not only image quality but also other quality dimensions such as depth perception. To study the relationship between the multi-dimensional quality and the key factors affecting each quality dimension, original stereoscopic videos are compressed with the latest 3D-HEVC video coding standard at different levels of image quality, accompanied by various levels of depth perception quality. These stereoscopic videos are then viewed by subjects in random order, and their ratings are used to establish the subjective quality assessment database. Through analysis of the resulting database, the effects of different distortion levels and depth ranges on the perceptual quality of various 3D video contents are studied.

(3) Propose no-reference stereoscopic video quality assessment algorithms based on hand-crafted features. First, this dissertation proposes a bitstream-level no-reference objective stereoscopic video quality assessment model, which extracts relevant features from compressed 3D video bitstreams and predicts the perceptual quality of 3D videos with a support vector regression model. Experimental results demonstrate that, compared with existing pixel-level full-reference 2D and 3D objective quality assessment methods, the proposed algorithm achieves a significant performance improvement. Second, image quality and depth perception quality are modeled separately to describe different aspects of the overall 3D quality. Based on the response of the human visual cortex to 3D visual signals, a depth perception quality assessment model is proposed and verified on the subjective quality assessment database established with the latest 3D video coding standard. Then, the key part of the proposed depth perception model is extended to the overall quality assessment of stereoscopic videos. A fusion natural scene statistics measurement and an auto-regressive prediction based disparity entropy measurement are proposed on the binocular summation and subtraction channels, respectively. Experiments on three publicly available databases show that the proposed method is effective on both symmetrically and asymmetrically distorted stereoscopic videos with different distortion types.

(4) Design no-reference stereoscopic content quality assessment methods based on two-stream deep convolutional neural networks. The remarkable ability of deep convolutional neural networks to learn discriminative features provides a promising solution for no-reference image and video quality assessment. Since a stereoscopic image consists of two 2D images for the left and right views, a two-stream deep convolutional neural network can take the left and right 2D images as inputs. Based on the binocular vision mechanism of the human visual system, this dissertation proposes a dual-stream interactive deep convolutional neural network, which is trained end to end on image patch pairs to obtain the perceptual quality of stereoscopic images. Because existing stereoscopic video subjective quality assessment databases are relatively small, distorted stereoscopic videos need to be preprocessed. Specifically, the intermediate frames of distorted stereoscopic videos in the temporal domain are extracted as key frames through three steps, so as to generate the corresponding sub-sequences of the videos. The key frames are then divided into image patches and fed into the two-stream deep convolutional neural network to learn the perceptual quality of stereoscopic videos. Experimental results show that the two proposed networks achieve better performance than existing methods on subjective quality assessment databases of stereoscopic images and stereoscopic videos.

(5) Propose no-reference panoramic image quality assessment algorithms. Based on the frequency-dependent characteristics of the human visual system and the viewing process of panoramic images, a no-reference panoramic image quality assessment method is proposed that combines multi-frequency information and image naturalness. First, low-frequency and high-frequency subbands are obtained by decomposing distorted panoramic images, and the entropy intensities of these subbands are calculated to reflect the multi-frequency information of panoramic images. Then, global and local naturalness features are extracted from the equirectangular projection maps and from various viewports. Finally, the multi-frequency and naturalness features are fused to obtain the perceptual quality of panoramic images. The superiority of this model is demonstrated on two publicly available panoramic image subjective quality assessment databases.

(6) Design perception-oriented super-resolution image quality evaluation methods. Since different super-resolution algorithms generate reconstructed images of varying quality, both full-reference and no-reference quality assessment algorithms for super-resolution images are designed in this dissertation. The full-reference method evaluates super-resolution images along two dimensions, structural fidelity and statistical naturalness, and linearly fuses the two to obtain the perceived quality of super-resolution images. For the no-reference method, according to the structure and texture distortion characteristics of super-resolution images, image data are generated by adaptive cropping of image patches and fed into the designed deep neural network. Experimental results show that the proposed methods are superior to existing image quality assessment algorithms on public databases.
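As a rough illustration of the binocular summation and subtraction channels mentioned in item (3), the following Python sketch builds both channels from a pair of grayscale views and measures the entropy of each. It is not the dissertation's method. It assumes the channels are defined as S = L + R and D = L - R (other scalings are also used in the literature), it substitutes plain Shannon entropy for the auto-regressive prediction based disparity entropy, and the function names `binocular_channels` and `shannon_entropy` are hypothetical.

```python
import math
from collections import Counter


def binocular_channels(left, right):
    """Return (summation, subtraction) channels for two equal-sized grayscale views.

    Assumption: S = L + R and D = L - R, computed pixel by pixel.
    Views are given as lists of rows of numeric pixel values.
    """
    same_rows = len(left) == len(right)
    if not same_rows or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("left and right views must have the same shape")
    summation = [[l + r for l, r in zip(lrow, rrow)]
                 for lrow, rrow in zip(left, right)]
    subtraction = [[l - r for l, r in zip(lrow, rrow)]
                   for lrow, rrow in zip(left, right)]
    return summation, subtraction


def shannon_entropy(channel):
    """Shannon entropy in bits of the empirical pixel-value distribution."""
    values = [v for row in channel for v in row]
    total = len(values)
    counts = Counter(values)
    # Sum of p * log2(1 / p) over the observed values.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy 2x2 views; the right view differs from the left in two pixels.
left_view = [[10, 20], [30, 40]]
right_view = [[10, 22], [30, 38]]

sum_ch, sub_ch = binocular_channels(left_view, right_view)
sum_entropy = shannon_entropy(sum_ch)
sub_entropy = shannon_entropy(sub_ch)
```

When the two views are identical, the subtraction channel is all zeros and its entropy is zero. A mismatch between the views, such as asymmetric distortion, spreads the subtraction values and raises the entropy, which is the intuition behind measuring the subtraction channel separately.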
Keywords/Search Tags: Image and video, 3D, Panorama, Super-resolution, Video coding, Subjective and objective quality assessment, Human visual system, Neural networks