Research On Long-Term Video Prediction Using Taylor Disentanglement

Posted on:2023-09-18

Degree:Master

Type:Thesis

Country:China

Candidate:T Pan

Full Text:PDF

GTID:2558306914481984

Subject:Information and Communication Engineering

Abstract/Summary:

With the development of the 5G information society and continued innovation in computer vision, videos, which carry rich temporal information, have received growing attention, and video prediction has become an active topic in deep learning research. Video prediction uses a sequence of historical video frames to predict future frames. The task is an intermediate step between raw video data and a decision-making system: it can extract latent patterns of dynamic evolution from raw video data and has broad application prospects in meteorology, transportation, and robotics. Mainstream video prediction models mostly fall into three frameworks: extensions of recurrent neural networks, prediction conditioned on proxy objects, and specialized architectures based on a factorized prediction space. However, existing work on video prediction fails to balance short-term and long-term prediction performance and to extract robust laws of latent dynamics from video frames. In response to these problems, the main work and contributions of this thesis are as follows:

(1) A novel principle for feature separation, Taylor feature separation, is proposed. The Taylor series is an important approximation method in physics. Taylor feature separation is inspired by the Taylor series and is therefore mathematically interpretable, unlike explicit feature separation that follows human intuition, such as separating foreground from background. This separation mode carries a mathematical prior, which reduces the difficulty of feature separation and makes dynamic modeling easier. Furthermore, the Taylor series applies to any sufficiently differentiable function, so the Taylor prior is also applicable to complex, chaotic systems.

(2) Based on this principle, a novel recurrent prediction module, TaylorCell, is proposed. It contains a Taylor prediction unit (TPU) and a memory correction unit (MCU). To avoid error accumulation, the TPU uses only a finite number of derivatives of the first input frame to predict future frames. The MCU corrects the Taylor feature predicted by the TPU by distilling information from all past frames through a gating mechanism.

(3) By integrating TaylorCell into a two-branch model, the thesis proposes a novel video prediction model, TaylorNet. A Taylor series has the property that the approximation error grows with distance from the expansion point. The proposed TaylorNet is therefore better suited to long-term prediction than to ultra-long-term prediction, and it works better on datasets with short-range spatial dependencies and stable dynamics. Moreover, TaylorNet has a small number of parameters.

On three general datasets, TaylorNet matches state-of-the-art models in short-term forecasting and outperforms them in long-term forecasting.
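
The Taylor prior described in the abstract can be illustrated with a small scalar sketch. This is not the thesis's TaylorCell or TaylorNet; it only shows the underlying idea under simplifying assumptions: a one-dimensional signal stands in for a video feature, derivatives at the first observed sample are estimated with forward differences, and future values are read off a truncated Taylor series. The function names (estimate_derivatives, taylor_predict, horizon_errors) and the choice of sin as a test signal are hypothetical and chosen for illustration.

```python
import math


def estimate_derivatives(frames, dt, order):
    """Estimate derivatives 0..order at the first sample via forward differences.

    Assumption for this sketch: `frames` is a list of scalars sampled every `dt`.
    """
    if len(frames) < order + 1:
        raise ValueError("need at least order + 1 observed samples")
    row = list(frames[: order + 1])
    derivs = []
    for k in range(order + 1):
        # The k-th forward difference divided by dt^k approximates f^(k)(t0).
        derivs.append(row[0] / dt ** k)
        row = [b - a for a, b in zip(row, row[1:])]
    return derivs


def taylor_predict(derivs, h):
    """Evaluate the truncated Taylor series at offset h from the expansion point."""
    return sum(d * h ** k / math.factorial(k) for k, d in enumerate(derivs))


def horizon_errors(signal, dt, order, horizons):
    """Absolute prediction error at each horizon, expanding around t = 0."""
    observed = [signal(i * dt) for i in range(order + 1)]
    derivs = estimate_derivatives(observed, dt, order)
    return [abs(taylor_predict(derivs, h) - signal(h)) for h in horizons]


if __name__ == "__main__":
    horizons = [0.5, 1.0, 2.0, 3.0]
    for h, err in zip(horizons, horizon_errors(math.sin, 0.1, 3, horizons)):
        print(f"h = {h:.1f}  |error| = {err:.4f}")
```

Because every prediction is computed directly from the derivatives at the expansion point, no predicted value is fed back as input, so errors do not compound step by step. The error still grows with the horizon h, which is consistent with the abstract's remark that the approach suits long-term better than ultra-long-term prediction.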

Keywords/Search Tags:

video prediction, feature separation, deep learning, spatiotemporal sequence prediction

Related items

1	Design And Implementation Of Spatiotemporal Sequence Prediction Algorithm For Perceived Quality In Mobile Networks
2	Citywide Crowd Flow Prediction Based On Deep Networks With Spatiotemporal Attention
3	Research On Video Frame Sequence Prediction Algorithm Based On Deep Learning
4	Video Scene Prediction Based On Deep Learning
5	Research On Short-term Precipitation Prediction Based On Spatiotemporal Prediction Network And Convolutional Neural Networ
6	The Local SVR And Its Applications In The Prediction Of Spatiotemporal Chaos Sequence
7	Research On Shopping Mall Choice Prediction Of Consumers Based On Spatiotemporal Data
8	Research On Weather Radar Echo Extrapolation Method Based On Deep Spatiotemporal Series Predictio
9	Research On Analysis And Prediction Of Spatiotemporal Big Data
10	Deep Learning For City-scale Wireless Traffic Prediction