Virtual reality (VR), as an emerging technology, integrates multimedia, sensors, novel displays, the Internet, and artificial intelligence across many fields, aiming to build a harmonious human-computer interaction environment through smart devices and to provide immersive visual, auditory, tactile, and even olfactory experiences. With the rapid development of 5G technology and continuous iteration within the industry, VR technology is becoming more mature and is being widely adopted across industries. As the principal form of VR content, omnidirectional video records natural scenes or computer-rendered scenes over a full 360° × 180° field of view. With a head-mounted display (HMD), users can freely look around the entire virtual scene and explore the parts that interest them, so viewing behavior varies from person to person. A good immersive experience places high demands on the quality of omnidirectional video. In practice, however, that quality is uneven: noise and abnormal camera motion during shooting can make users feel dizzy and fatigued, greatly damaging the viewing experience. Evaluating the quality of omnidirectional video, and using that evaluation to guide and supervise key technical stages such as compression, transmission, optimization, and content editing, is therefore crucial to improving the user experience and to the development of the related technologies.

We conduct a comprehensive study of the visual quality of omnidirectional video from both subjective and objective perspectives. The research contents are as follows.

(1) We start from users' viewing behavior in the VR environment. To study the conditional factors that affect viewing behavior (such as the starting point and the exploration time), clarify how viewing conditions shape viewing behavior, and explore the impact of viewing behavior on perceived quality, we construct a large-scale subjective quality evaluation database for omnidirectional video. The database contains 502 original videos downloaded from the web, covering a variety of content scenes, and all distortions are authentic. Based on this database, we conduct a subjective experiment with different viewing conditions and collect users' subjective quality ratings and viewing-behavior data under the corresponding conditions. Statistical analysis of the experimental data shows that viewing conditions are important factors affecting both viewing behavior and perceived quality.

(2) Motivated by this analysis and by the viewing characteristics of omnidirectional video, we propose a no-reference omnidirectional video quality assessment model based on user viewing behavior, and train and test it on the proposed database. The model consists of four modules: user behavior modeling, visual quality feature extraction, temporal feature modeling, and perceptual quality prediction. Following the analysis of the experimental data, we consider and quantify three kinds of user viewing behavior: a static scan path, an equator-following path, and real recorded scan paths. After projection-space conversion, viewport sequences are extracted from the sphere to simulate how users explore omnidirectional videos. We then use an Inception-ResNet-v2 network pre-trained on ImageNet as the spatial feature extractor, making full use of quality-related spatial features. Furthermore, we use gated recurrent units (GRUs) to model long-term dependencies among the feature vectors and obtain quality scores for individual viewports, followed by a temporal hysteresis pooling strategy that yields the overall quality score of the omnidirectional video. Experimental results on the proposed database validate the effectiveness of the proposed model.
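To illustrate the viewport extraction step described above, the following is a minimal sketch of gnomonic (rectilinear) projection from an equirectangular (ERP) frame, given one scan-path point. The 90° field of view, the 224 × 224 output size, and nearest-neighbour sampling are illustrative assumptions, not the exact settings of the proposed model.

```python
import numpy as np

def extract_viewport(erp_frame, yaw, pitch, fov=np.pi / 2, size=224):
    """Sketch: sample a rectilinear viewport from an ERP frame.
    erp_frame: H x W x C array; yaw/pitch in radians (scan-path point).
    fov and size are illustrative assumptions."""
    H, W = erp_frame.shape[:2]
    f = 0.5 * size / np.tan(0.5 * fov)  # pinhole focal length in pixels
    u, v = np.meshgrid(np.arange(size) - size / 2 + 0.5,
                       np.arange(size) - size / 2 + 0.5)
    # Ray directions in camera coordinates (x right, y down, z forward).
    d = np.stack([u, v, np.full_like(u, f)], axis=-1)
    d /= np.linalg.norm(d, axis=-1, keepdims=True)
    # Rotate rays by pitch (around x), then yaw (around y).
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    d = d @ (Ry @ Rx).T
    # Spherical angles -> ERP pixel coordinates (nearest-neighbour sample).
    lon = np.arctan2(d[..., 0], d[..., 2])      # [-pi, pi]
    lat = np.arcsin(np.clip(d[..., 1], -1, 1))  # [-pi/2, pi/2]
    x = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    y = np.clip(((lat / np.pi + 0.5) * H).astype(int), 0, H - 1)
    return erp_frame[y, x]
```

Sampling one viewport per frame along a scan path produces the viewport sequence that is then fed to the spatial feature extractor.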
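The temporal hysteresis pooling mentioned above reflects the observation that viewers remember recent quality drops more strongly than quality peaks. A simplified sketch of such pooling over per-viewport scores is shown below; the window size `tau`, the decaying weights, and the blend weight `gamma` are illustrative assumptions, not the exact configuration used in the proposed model.

```python
import numpy as np

def hysteresis_pool(scores, tau=12, gamma=0.5):
    """Sketch of temporal hysteresis pooling over per-viewport scores.
    A memory term tracks the worst recent quality; a current term
    down-weights upcoming high scores. tau/gamma are assumptions."""
    scores = np.asarray(scores, dtype=float)
    T = len(scores)
    memory = np.empty(T)
    current = np.empty(T)
    for t in range(T):
        # Memory element: the worst quality in the preceding window.
        memory[t] = scores[0] if t == 0 else scores[max(0, t - tau):t].min()
        # Current element: upcoming scores sorted ascending, so lower
        # (worse) scores receive the larger, slowly decaying weights.
        future = np.sort(scores[t:min(T, t + tau + 1)])
        w = np.exp(-np.arange(len(future)))
        current[t] = (future * w).sum() / w.sum()
    pooled = gamma * memory + (1.0 - gamma) * current
    return pooled.mean()  # overall quality score of the video
```

A constant score sequence pools to that constant, while a sequence with a brief quality drop pools below its arithmetic mean, matching the intended hysteresis effect.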