TY - JOUR
T1 - A Temporally-Aware Interpolation Network for Video Frame Inpainting
AU - Szeto, Ryan
AU - Sun, Ximeng
AU - Lu, Kunyi
AU - Corso, Jason J.
PY - 2020/5/1
Y1 - 2020/5/1
N2 - In this work, we explore video frame inpainting, a task that lies at the intersection of general video inpainting, frame interpolation, and video prediction. Although our problem can be addressed by applying methods from other video interpolation or extrapolation tasks, doing so fails to leverage the additional context information that our problem provides. To this end, we devise a method specifically designed for video frame inpainting that is composed of two modules: a bidirectional video prediction module and a temporally-aware frame interpolation module. The prediction module makes two intermediate predictions of the missing frames, each conditioned on the preceding and following frames, respectively, using a shared convolutional LSTM-based encoder-decoder. The interpolation module blends the intermediate predictions by using time information and hidden activations from the video prediction module to resolve disagreements between the predictions. Our experiments demonstrate that our approach produces smoother and more accurate results than state-of-the-art methods for general video inpainting, frame interpolation, and video prediction.
AB - In this work, we explore video frame inpainting, a task that lies at the intersection of general video inpainting, frame interpolation, and video prediction. Although our problem can be addressed by applying methods from other video interpolation or extrapolation tasks, doing so fails to leverage the additional context information that our problem provides. To this end, we devise a method specifically designed for video frame inpainting that is composed of two modules: a bidirectional video prediction module and a temporally-aware frame interpolation module. The prediction module makes two intermediate predictions of the missing frames, each conditioned on the preceding and following frames, respectively, using a shared convolutional LSTM-based encoder-decoder. The interpolation module blends the intermediate predictions by using time information and hidden activations from the video prediction module to resolve disagreements between the predictions. Our experiments demonstrate that our approach produces smoother and more accurate results than state-of-the-art methods for general video inpainting, frame interpolation, and video prediction.
KW - Video inpainting
KW - frame interpolation
KW - temporal upsampling
KW - video prediction
UR - http://www.scopus.com/inward/record.url?scp=85074835836&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074835836&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2019.2951667
DO - 10.1109/TPAMI.2019.2951667
M3 - Article
C2 - 31714216
AN - SCOPUS:85074835836
SN - 0162-8828
VL - 42
SP - 1053
EP - 1068
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 5
M1 - 8892406
ER -