At present, video has become the main carrier of information transmission, and more and more users create their own videos and upload them to social media. Because users' hardware, photography skills, and shooting environments vary widely, the capture process may introduce shake, overexposure, defocus, and other quality-degrading factors, and further degradation is introduced by post-processing, compression, and transmission. As a result, video quality on major video websites and social media is highly uneven, and the demand for objective quality assessment of in-the-wild videos grows day by day. Assessing the quality of user-created videos poses two main difficulties: first, pristine reference videos are difficult to obtain; second, such videos are often affected by multiple degradation factors at once. To this end, this paper studies deep-learning-based no-reference objective quality assessment algorithms for in-the-wild videos, so as to facilitate quality control by video content creators and distribution platforms and to improve the viewing experience. The main work and innovations of the paper are as follows:

(1) To address the insufficient extraction of low-level video semantics in existing algorithms, a no-reference video quality assessment method, MSTVQA, based on multi-scale spatio-temporal feature extraction is proposed. First, pretrained models extract the low-level and high-level semantics of the video to construct multi-scale spatial features and motion features, which are combined with a channel attention mechanism to further enhance the expressiveness of key features. Second, a bidirectional GRU models the long-term dependencies among the spatio-temporal features. Finally, an average-pooling strategy computes the video quality score. Single-dataset and cross-dataset experiments on four public datasets show that the proposed model outperforms existing representative models on the evaluation metrics and has good generalization ability.

(2) To improve the consistency between algorithm scores and human subjective quality scores, a no-reference video quality assessment method, STTPVQA, based on human visual perceptual characteristics is proposed. The algorithm is distinguished by exploiting the visual saliency and time-lag effects of the human visual system to improve prediction accuracy. For visual saliency, the algorithm first uses a salient object detection model to extract the salient objects in the video, then extracts their multi-scale spatial features, and finally fuses these with the video's spatio-temporal features to obtain multi-scale spatio-temporal features enriched with saliency information. For the time-lag effect, the hysteresis observed in human subjective experiments is modeled, and frame-level quality scores are mapped to a video-level quality score with a temporal pooling strategy. Experiments on four public datasets show that the proposed method agrees more closely with human perceptual scores, and ablation experiments demonstrate the effectiveness of each module of the algorithm.

(3) Based on the algorithms proposed above, a platform for objective video quality assessment is designed and implemented. The platform adopts a browser/server architecture; it evaluates the objective quality of user-uploaded videos and also provides functions such as video playback and parameter display.
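The time-lag-aware temporal pooling described in (2) can be sketched as follows. This is a minimal illustration of the general hysteresis idea (viewers react quickly to quality drops but recover slowly after quality improves), not the thesis's actual formulation; the window length `tau`, the blending weight `gamma`, and the softmin weighting are illustrative assumptions.

```python
import numpy as np

def temporal_pooling(frame_scores, tau=12, gamma=0.5):
    """Map frame-level quality scores to a single video-level score while
    modelling the time-lag (hysteresis) effect of human judgment.

    Illustrative sketch: tau, gamma, and the weighting scheme are
    assumptions, not parameters taken from the thesis.
    """
    q = np.asarray(frame_scores, dtype=float)
    T = len(q)
    pooled = np.empty(T)
    for t in range(T):
        # Memory term: the worst quality in the recent past dominates,
        # reflecting the slow recovery of the viewer's impression.
        memory = q[max(0, t - tau):t + 1].min()
        # Current term: a softmin-weighted average of the near future,
        # so low scores are emphasised over high ones.
        future = q[t:min(T, t + tau + 1)]
        w = np.exp(-future)  # lower scores receive larger weights
        current = (w * future).sum() / w.sum()
        pooled[t] = gamma * memory + (1 - gamma) * current
    # Video-level score: average of the hysteresis-adjusted frame scores.
    return pooled.mean()
```

Because of the memory term, a brief quality drop pulls the video-level score below the plain frame-score mean, mimicking the lag in subjective ratings.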