Please use this identifier to cite or link to this item:

http://hdl.handle.net/10609/123286
Title: Folded recurrent neural networks for future video prediction
Author: Oliu Simon, Marc
Selva, Javier
Escalera Guerrero, Sergio
Others: Universitat Oberta de Catalunya (UOC)
Universitat de Barcelona
Keywords: future video prediction
unsupervised learning
recurrent neural networks
Issue Date: 9-Oct-2018
Publisher: European Conference on Computer Vision
Citation: Oliu-Simón, M., Selva, J. & Escalera Guerrero, S. (2018). Folded recurrent neural networks for future video prediction. Lecture Notes in Computer Science, 11218, 745-761. doi: 10.1007/978-3-030-01264-9_44
Published in: European Conference on Computer Vision ECCV, Munich, Germany, 8-14 September, 2018
Project identifier: info:eu-repo/grantAgreement/TIN2016-74946-P
Also see: https://doi.org/10.1007/978-3-030-01264-9_44
Abstract: This work introduces double-mapping Gated Recurrent Units (dGRU), an extension of standard GRUs where the input is considered as a recurrent state. An extra set of logic gates is added to update the input given the output. Stacking multiple such layers results in a recurrent auto-encoder: the operators updating the outputs comprise the encoder, while the ones updating the inputs form the decoder. Since the states are shared between corresponding encoder and decoder layers, the representation is stratified during learning: some information is not passed to the next layers. We test our model on future video prediction. Main challenges for this task include high variability in videos, temporal propagation of errors, and non-specificity of future frames. We show how only the encoder or decoder needs to be applied for encoding or prediction. This reduces the computational cost and avoids re-encoding predictions when generating multiple frames, mitigating error propagation. Furthermore, it is possible to remove layers from a trained model, giving an insight to the role of each layer. Our approach improves state of the art results on MMNIST and UCF101, being competitive on KTH with 2 and 3 times less memory usage and computational cost than the best scored approach.
Language: English
URI: http://hdl.handle.net/10609/123286
ISSN: 0302-9743
Appears in Collections: Articles

Files in This Item:
File             Size      Format
1712.00311.pdf   2.05 MB   Adobe PDF

Items in repository are protected by copyright, with all rights reserved, unless otherwise indicated.