Fast ImageNet training code with FFCV
Installation: install PyTorch and FFCV (https://github.com/libffcv/ffcv).
Run via python train_ffcv.py.
The datasets are already preprocessed (available in /self/scr-sync/nlp/imagenet_ffcv
on each NLP cluster machine).
Training is fast because the entire ImageNet dataset is cached in memory.
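For context, the caching comes from FFCV's os_cache option, which serves batches through the OS page cache, so after the first pass the data is effectively read from RAM. A minimal sketch (the .beton filename under imagenet_ffcv is my assumption, not necessarily the repo's actual file name):

```python
from ffcv.loader import Loader, OrderOption

# os_cache=True reads through the OS page cache, so after the first epoch
# the full ~70G dataset is effectively served from RAM rather than disk.
loader = Loader(
    "/self/scr-sync/nlp/imagenet_ffcv/train.beton",  # filename is an assumption
    batch_size=256,
    num_workers=4,
    order=OrderOption.RANDOM,  # RANDOM assumes the dataset fits in memory
    os_cache=True,
)
```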
Tested on 4 3090s: training a ResNet50 takes 10 hours (~10 epochs/hour for 100 epochs).
On 1 3090, it takes 36 hours.
On 4 A100s, it takes 5 hours.
Accuracy for a ResNet50 on 4 GPUs with batch size 256 per GPU (default params) is 75.4%.
Since the dataset itself is ~70G, I would request 100G of memory for the job to be safe.
# of CPUs to request = 2 * (num_workers + 1) * num_gpus, so 40 CPUs for 4 GPUs.
This is because each GPU requires num_workers dataloading threads and one thread for the model.
Multiply by 2 since the dataloading transforms are JIT-compiled, so hyperthreading doesn't give much of a performance gain.
This means that on machines with only 32 Slurm CPUs, it's probably best to use 3 workers per GPU (but you should try for yourself).
Otherwise, I found 4 workers per GPU with a total request of 40 CPUs to give fast speeds on the 3090 setup.
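As a sanity check, here's the rule of thumb above in code (the function name is mine, not from the repo):

```python
def slurm_cpus_to_request(num_workers: int, num_gpus: int) -> int:
    """Each GPU needs num_workers dataloading threads plus one model thread,
    doubled because the JIT-compiled transforms get little benefit from
    hyperthreading."""
    return 2 * (num_workers + 1) * num_gpus

assert slurm_cpus_to_request(num_workers=4, num_gpus=4) == 40  # the 3090 setup
assert slurm_cpus_to_request(num_workers=3, num_gpus=4) == 32  # 32-CPU machines
```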
This is essentially just the standard PyTorch ImageNet training script (https://github.com/pytorch/examples/blob/main/imagenet/main.py) with a modified dataloader. The meat of the dataloader changes is here: https://github.com/tatsu-lab/fast_imagenet_ffcv/blob/main/train_ffcv.py#L123-L154.
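For the curious, here's roughly what those dataloader bits look like. This is a sketch in the style of FFCV's standard ImageNet pipelines; the exact transforms, dtype, and crop size in the repo may differ, so see the link above for the real thing:

```python
import numpy as np
import torch
from ffcv.fields.decoders import IntDecoder, RandomResizedCropRGBImageDecoder
from ffcv.transforms import (NormalizeImage, RandomHorizontalFlip, Squeeze,
                             ToDevice, ToTensor, ToTorchImage)

# FFCV normalizes in [0, 255] space, hence the * 255.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406]) * 255
IMAGENET_STD = np.array([0.229, 0.224, 0.225]) * 255
device = torch.device("cuda:0")

# Decode, augment, and normalize inside FFCV's JIT-compiled pipeline.
image_pipeline = [
    RandomResizedCropRGBImageDecoder((224, 224)),
    RandomHorizontalFlip(),
    ToTensor(),
    ToDevice(device, non_blocking=True),
    ToTorchImage(),  # NHWC uint8 -> NCHW torch tensor
    NormalizeImage(IMAGENET_MEAN, IMAGENET_STD, np.float16),
]
label_pipeline = [
    IntDecoder(),
    ToTensor(),
    Squeeze(),
    ToDevice(device, non_blocking=True),
]

# These get passed to the Loader as
# pipelines={"image": image_pipeline, "label": label_pipeline}.
```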