VideoAutoEncoder

VideoAutoencoder Logo

A small experiment to build an efficient video autoencoder for GPUs with little VRAM.

Datasets used:

Refactor: https://huggingface.co/datasets/Fredtt3/Videos
Original: https://huggingface.co/datasets/lmms-lab/VideoDetailCaption

AdaptiveEfficientVideoAutoencoder (Version 0.3.0)

The new AdaptiveEfficientVideoAutoencoder can handle videos of different qualities and durations. Tests and improvements on this autoencoder are ongoing; we have noticed that, depending on the quality and duration, it takes longer to learn to reconstruct.

All information regarding VideoAutoEncoder usage and training is in the Test folder.
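Since the adaptive variant accepts clips of different durations, one plausible preprocessing step (an assumption for illustration, not taken from the repo — the real code lives in the Test folder) is to pad or truncate each clip to a fixed frame count before batching:

```python
def normalize_length(frames, target):
    """Pad (by repeating the last frame) or truncate to `target` frames.

    Hypothetical preprocessing sketch; `frames` stands in for a list of
    video frames of any type.
    """
    if len(frames) >= target:
        return frames[:target]
    return frames + [frames[-1]] * (target - len(frames))

# A 3-frame clip padded up to 5 frames:
clip = ["f0", "f1", "f2"]
padded = normalize_length(clip, 5)  # ["f0", "f1", "f2", "f2", "f2"]
```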

Memory usage for 480p, 5-second videos at 15 fps

Reconstruction at 480p, 5 s, 15 fps
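For context on why VRAM is the bottleneck, the raw size of a single such clip can be estimated with simple arithmetic (figures here assume an 854x480 RGB float32 tensor; the actual pipeline may store frames differently):

```python
# Rough raw-tensor size for one 480p, 5 s, 15 fps clip.
# Illustrative only; the training pipeline may use other dtypes or layouts.
height, width, channels = 480, 854, 3
frames = 5 * 15             # 5 seconds at 15 fps -> 75 frames
bytes_per_value = 4         # float32

total_bytes = height * width * channels * frames * bytes_per_value
total_mib = total_bytes / 2**20  # roughly 352 MiB before any activations
```

Activations inside the network multiply this figure several times over, which is why low-VRAM training needs careful optimization.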

Memory Usage Comparison

Version 0.1.0

RAM

VRAM

Version 0.2.0

RAM

VRAM

Version 0.3.0

You can now train from Colab on 240p, 10-second videos at 15 fps.

Installation

git clone https://github.com/Rivera-ai/VideoAutoencoder.git
cd VideoAutoencoder
pip install -e .

Installation via PyPI

pip install VideoAutoencoder
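As a rough intuition for what an autoencoder like this buys you, the sketch below computes the compression ratio between raw pixels and a latent code. The downsampling factors and latent channel count are hypothetical placeholders, not figures from this project:

```python
def compression_ratio(h, w, t, c_in=3, spatial=8, temporal=4, c_latent=16):
    """Ratio of raw pixel volume to latent volume.

    All factors (8x spatial, 4x temporal, 16 latent channels) are
    illustrative assumptions, not this repo's actual architecture.
    """
    raw = h * w * t * c_in
    latent = (h // spatial) * (w // spatial) * (t // temporal) * c_latent
    return raw / latent

# e.g. an 800x480 clip with 80 frames under these assumptions:
ratio = compression_ratio(800, 480, 80)  # 48.0x smaller in latent space
```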

Training Results V0.1.0

Epoch 0 Reconstruction Progress

The following demonstrations show the reconstruction quality at different steps during the first epoch of training:

Step 0

Step 0 Reconstruction

Step 50

Step 50 Reconstruction

Step 100

Step 100 Reconstruction

Step 150

Step 150 Reconstruction

Step 200

Step 200 Reconstruction

Training Results V0.2.0

Epoch 0 Reconstruction Progress

The following demonstrations show the reconstruction quality at different steps during the first epoch of training:

Step 0

Step 0 Reconstruction

Step 200

Step 200 Reconstruction

Epoch 1

Step 450

Step 450 Reconstruction

Epoch 2

Step 650

Step 650 Reconstruction

Epoch 3

Step 850

Step 850 Reconstruction

Epoch 4

Step 1050

Step 1050 Reconstruction

Training on larger datasets and for more epochs will, of course, yield better reconstructions. Version 0.2.0 is much better optimized and can train on as little as 3 GB of VRAM, at the cost of requiring more epochs and training steps.
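One common trick for fitting video training into small VRAM budgets is to process a clip in temporal chunks, so peak activation memory scales with the chunk length rather than the full clip. This is a generic sketch of the bookkeeping, assumed for illustration, not a description of this repo's actual optimization:

```python
def chunk_frames(num_frames, chunk_size):
    """Yield (start, end) index pairs covering all frames in order.

    Generic chunking helper; whether this project uses temporal
    chunking is an assumption, not confirmed by the README.
    """
    for start in range(0, num_frames, chunk_size):
        yield start, min(start + chunk_size, num_frames)

# A 5 s clip at 15 fps has 75 frames; chunks of 15 frames -> 5 passes:
chunks = list(chunk_frames(75, 15))
# [(0, 15), (15, 30), (30, 45), (45, 60), (60, 75)]
```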