| In recent years, users' expectations for the media consumption experience have steadily risen, and the era of immersive media is arriving. As one of the most important forms of immersive VR media, omnidirectional media has entered a stage of rapid development. Because only the content within the field of view (FoV) is actually watched when consuming omnidirectional video, FoV-based transmission schemes deliver the content inside the FoV at high quality and reduce the quality of the content outside it, thereby reducing the overall amount of transmitted data. System delay can cause the transmitted FoV content to mismatch the user's actual FoV, so predicting the user's FoV in advance becomes an indispensable step in omnidirectional media systems that use FoV-based transmission. Different FoV-based transmission schemes place different demands on FoV prediction, which has encouraged the development of both sub-picture FoV prediction and viewpoint FoV prediction. The main challenge is how to capture the regularities of user viewing behavior and build a prediction model that reflects them in order to improve the accuracy of FoV prediction. This study focuses on viewpoint prediction and sub-picture FoV prediction for omnidirectional video. First, an FoV viewpoint prediction model based on the user's head rotation speed is established. A hidden Markov model (HMM) and a Gaussian mixture model (GMM) are used to relate the user's head rotation speed to particular hidden states. A further model is designed to describe the correlation between changes in the hidden states and changes in the user's rotation speed. Based on this information, the viewpoint of the FoV can then be predicted. This prediction model shows improved prediction accuracy in simulation experiments. Second, in the existing scheme of
sub-picture FoV prediction using an LSTM network, different types of input features are stacked without fully exploiting the regularities of user viewing behavior, and the training target does not match practical application scenarios. We propose an improved sub-picture FoV prediction model based on an LSTM network. Because the user's viewing behavior is strongly correlated over short time spans and is influenced by the omnidirectional video content, feature extraction is proposed for both the omnidirectional content and the user's viewing behavior. A further content-feature processing procedure is designed to represent the dynamic correlation between the input and the content features. In addition, the output and the training target are made to cover the user's FoV within a certain time range, and a scheme is then designed to decide a definite FoV region. The proposed model improves on existing methods in both prediction accuracy and F-score. |
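
To make the sub-picture evaluation concrete, the following minimal sketch shows one way a target covering the user's FoV over a time window could be built, how a definite FoV region could be chosen from per-tile scores, and how the F-score of that choice is computed. It is an illustration under assumptions, not the model proposed in this study: the integer tile indices, the fixed score threshold, the sample data, and the function names `window_target`, `decide_fov`, and `f_score` are all hypothetical.

```python
from typing import Dict, List, Set


def window_target(viewed_per_frame: List[Set[int]]) -> Set[int]:
    """Union of the tile indices viewed in every frame of the window."""
    target: Set[int] = set()
    for tiles in viewed_per_frame:
        target |= tiles
    return target


def decide_fov(tile_scores: Dict[int, float], threshold: float = 0.5) -> Set[int]:
    """Keep tiles whose predicted score reaches the threshold.

    A fixed threshold is an assumed decision rule, used here only
    as a stand-in for the region-decision scheme.
    """
    return {tile for tile, score in tile_scores.items() if score >= threshold}


def f_score(predicted: Set[int], actual: Set[int]) -> float:
    """F1 score: harmonic mean of precision and recall over tile sets."""
    hits = len(predicted & actual)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: tiles seen in three consecutive frames of one window,
# and hypothetical per-tile scores produced by some predictor.
viewed = [{5, 6}, {6, 7}, {7, 11}]
scores = {5: 0.9, 6: 0.8, 7: 0.7, 10: 0.6, 11: 0.3}

target = window_target(viewed)   # {5, 6, 7, 11}
predicted = decide_fov(scores)   # {5, 6, 7, 10}
print(round(f_score(predicted, target), 3))  # 0.75
```

In this example three of the four predicted tiles are actually viewed, and three of the four viewed tiles are predicted, so precision and recall are both 0.75. The F-score penalizes both sending tiles that are never watched and missing tiles that are, which is why it complements plain accuracy when evaluating FoV-based transmission.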