This is a small experiment in building an efficient Video Autoencoder for graphics cards with little VRAM.
Refactored dataset: https://huggingface.co/datasets/Fredtt3/Videos
Original dataset: https://huggingface.co/datasets/lmms-lab/VideoDetailCaption
The new AdaptiveEfficientVideoAutoencoder offers a Video AutoEncoder that can handle different video qualities and durations. Tests and improvements on this autoencoder are ongoing; we have noticed that how long it takes to learn to reconstruct depends on the quality and duration of the input videos.
All information regarding VideoAutoEncoder usage and training is in the Test folder.
Install from source:

```bash
git clone https://github.com/Rivera-ai/VideoAutoencoder.git
cd VideoAutoencoder
pip install -e .
```

Or install from PyPI:

```bash
pip install VideoAutoencoder
```
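Once installed, a quick smoke test might look like the sketch below. Note that the import path, the constructor arguments, and the `encode`/`decode` method names are assumptions made here for illustration; the Test folder contains the actual usage examples.

```python
import torch
# Import path is an assumption; check the Test folder for the real one.
from VideoAutoencoder import AdaptiveEfficientVideoAutoencoder

# Hypothetical constructor: the actual arguments controlling quality and
# duration are documented in the Test folder examples.
model = AdaptiveEfficientVideoAutoencoder()
model.eval()

# Dummy clip in a common (batch, channels, frames, height, width) layout.
video = torch.randn(1, 3, 16, 128, 128)

with torch.no_grad():
    latent = model.encode(video)           # assumed method name
    reconstruction = model.decode(latent)  # assumed method name

print(video.shape, reconstruction.shape)
```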
The following demonstrations show the reconstruction quality at different steps during the first epoch of training:
Training on larger datasets and for more epochs will obviously yield better reconstructions. Version 0.2.0 is much better optimized and can train on as little as 3 GB of VRAM, but at the cost of requiring more epochs and training steps.
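As an illustration of how training can fit into a few GB of VRAM, the sketch below combines mixed precision with gradient accumulation, a standard recipe for this kind of constraint. It is a generic low-memory loop with a stand-in model and dummy data, not the repository's actual training script; see the Test folder for that.

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"

# Stand-in model: replace with the autoencoder from this repository.
model = nn.Sequential(
    nn.Conv3d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv3d(8, 3, kernel_size=3, padding=1),
).to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.amp.GradScaler(enabled=use_amp)
accum_steps = 4  # effective batch = micro-batch size * accum_steps

# Dummy micro-batches of shape (batch, channels, frames, height, width).
batches = [torch.randn(1, 3, 8, 64, 64) for _ in range(8)]

optimizer.zero_grad(set_to_none=True)
for step, clip in enumerate(batches):
    clip = clip.to(device)
    # Mixed precision roughly halves activation memory on CUDA; no-op on CPU.
    with torch.amp.autocast(device_type=device, enabled=use_amp):
        recon = model(clip)
        # Scale the loss so accumulated gradients average over micro-batches.
        loss = nn.functional.mse_loss(recon, clip) / accum_steps
    scaler.scale(loss).backward()
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```

Small micro-batches plus accumulation keep peak memory low while preserving a reasonable effective batch size, which is consistent with the trade-off noted above: less VRAM, but more steps to converge.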