diff --git a/README.md b/README.md
index 9042ed5..4aee451 100644
--- a/README.md
+++ b/README.md
@@ -1,121 +1,27 @@
-# MLPlayground
+# MIM-Refiner
 
-## Setup
+Pytorch implementation of MIM-Refiner.
 
-#### environment
-`conda env create --file environment_.yml --name `
-this will most likely install pytorch 2.0.0 with some old cuda version -> install newer cuda version
-`pip install torch==2.0.0+cu117 torchvision==0.15.0+cu117 --index-url https://download.pytorch.org/whl/cu117`
-`pip install torch==2.0.0+cu118 torchvision==0.15.0+cu118 --index-url https://download.pytorch.org/whl/cu118`
-`pip install torch==2.1.1+cu121 torchvision==0.16.1+cu121 --index-url https://download.pytorch.org/whl/cu121`
+# Pre-trained Models
 
-you can check the installed version with:
-```
-import torch
-torch.__version__
-torch.version.cuda
-```
-
-#### Optional: special libraries
-
-- `pip install cyanure-mkl` (logistic regression; only available on linux; make sure you are on a GPU node to install)
-
-#### configuration
-
-#### static_config.yaml
-
-choose one of the two options:
-
-- copy a template and adjust values to your setup `cp template_static_config_iml.yaml static_config.yaml`
-- create a file `static_config.yaml` with the first line `template: ${yaml:template_static_config_iml}`
-  - overwrite values in the template by adding lines of the format `template.: `
-    - `template.account_name: `
-    - `template.output_path: `
-  - add new values
-    - `model_path: `
-  - example:
-    ```
-    template: ${yaml:template_static_config_iml}
-    template.account_name: 
-    template.output_path: /system/user/publicwork/ssl/save
-    template.local_dataset_path: /localdata
-    ```
-
-#### optional configs configs
-- create wandb config(s) (use via `--wandb_config ` in CLI or `wandb: .yaml`
-  - adjust values to your setup
-- create a default wandb config (this will be used when no wandb config is defined)
-  - `cp template_wandb_config.yaml wandb_config.yaml`
-  - adjust values to your setup
-- create `sbatch_config.yaml` (only required for `main_sbatch.py` on slurm clusters)
-  - `cp template_config_sbatch.yaml sbatch_config.yaml`
-- create `template_sbatch_nodes.yaml` (only required for running `main_sbatch.py --nodes ` on slurm clusters)
-  - `cp template_sbatch_nodes_.yaml template_sbatch_nodes.yaml`
-- create `template_sbatch_gpus.yaml` (only required for running `main_sbatch.py --gpus ` on slurm clusters)
-  - `cp template_sbatch_gpus_.yaml template_sbatch_gpus.yaml`
-
-## Run
-
-### Runs require the following arguments
-
-- `--hp ` e.g. `--hp hyperparams.yaml` define what to run
-- `--devices ` e.g. `--devices 0` to run on GPU0 or `--devices 0,1,2,3` to run on 4 GPUs
-
-### Run with SLURM
-
-`python main_sbatch.py --time 2-00:00:00 --qos default --nodes 1 ADDITIONAL_ARGUMENTS`
-`python main_sbatch.py --time 2-00:00:00 --qos default --nodes 1 --hp --name `
-
-### Optional arguments (most important ones)
-
-- `--name ` what name to assign in wandb
-- `--wandb_config ` what wandb configuration to use (by default the `wandb_config.yaml` in the MLPlayground
-  directory will be used)
-  - only required if you have either `default_wandb_mode` to `online`/`offline` or pass `--wandb_mode `
-    which is `online`/`offline` (a warning will be logged if you specify it with `wandb_mode=disabled`)
-- `--num_workers` specify how many workers will be used for data loading
-  - by default `num_workers` will be `number_of_cpus / number_of_gpus`
-
-### Development arguments
-
-- `--accelerator cpu` runs on cpu (can still use multiple devices for debugging multi-gpu runs but with cpu)
-- `--mindatarun` adjusts datasets length, epochs, logger intervals and batchsize to a minimum
-- `--minmodelrun` replaces all values in the hp yaml of the pattern `${select:model_key:${yaml:models/...}}`
-  with `${select:debug:${yaml:models/...}}`
-  - you can define your model size with a model key and it will automatically replace it with a minimal model
-    - e.g. `encoder_model_key: tiny` for a ViT-T as encoder
-      with `encoder_params: ${select:${vars.encoder_model_key}:${yaml:models/vit}}` will select a very light-weight ViT
-- `--testrun` combination of `--mindatarun` and `--minmodelrun`
-
-## Data setup
+TODO
 
-#### data_loading_mode == "local"
-- ImageFolder datasets can be stored as zip files (see SETUP.md for creating these)
-  - 1 zip per split (slow unpacking): ImageNet/train -> ImageNet/train.zip
-  - 1 zip per class per split (fast unpacking): ImageNet/train/n1830348 -> ImageNet/train/n1830348.zip
-- sync zipped folders to other servers `rsync -r /localdata/imagenet1k host:/data/`
+# Train your own models
 
-## Resume run
+Instructions to setup the codebase on your own environment are provided in SETUP_CODE, SETUP_DATA and SETUP_MODELS.
 
-### Via CLI
-- `--resume_stage_id ` resume from `cp=latest`
-- `--resume_stage_id --resume_checkpoint E100` resume from epoch 100
-- `--resume_stage_id --resume_checkpoint U100` resume from update 100
-- `--resume_stage_id --resume_checkpoint S1024` resume from sample 1024
+# Citation
 
-### Via yaml
-add a resume initializer to the trainer
+If you like our work, please consider giving it a star :star: and cite us
 ```
-trainer:
-  ...
-  initializer:
-    kind: resume_initializer
-    stage_id: ???
-    checkpoint:
-      epoch: 100
+@article{alkin2024mimrefiner,
+  title={TODO},
+  author={TODO},
+  journal={TODO},
+  year={TODO}
+}
 ```
\ No newline at end of file
diff --git a/SETUP_CODE.md b/SETUP_CODE.md
index 3209b6c..9c7fe31 100644
--- a/SETUP_CODE.md
+++ b/SETUP_CODE.md
@@ -59,59 +59,58 @@ yamls
 create a folder `wandb_configs`, copy the `template_wandb_config.yaml` into it, adjust the
 `entity`/`project` in this file and rename it to `v4.yaml`. Every run that defines `wandb: v4` will now fetch the
 details from this file and log your metrics to this W&B project.
+## SLURM config
 
-## Run
+This codebase supports runs in SLURM environments. For this, you need to provide some additional configurations.
+Copy the `template_sbatch_config.yaml`, rename it to `sbatch_config.yaml` and adjust the values to your setup.
 
-### Runs require the following arguments
+Copy the `template_sbatch_nodes_github.sh`, rename it to `template_sbatch_nodes.sh` and adjust the values to your setup.
 
-- `--hp ` e.g. `--hp hyperparams.yaml` define what to run
-- `--devices ` e.g. `--devices 0` to run on GPU0 or `--devices 0,1,2,3` to run on 4 GPUs
 
-### Run with SLURM
+## Start Runs
+
+You can start runs with the `main_train.py` file. For example
 
-`python main_sbatch.py --time 2-00:00:00 --qos default --nodes 1 ADDITIONAL_ARGUMENTS`
-`python main_sbatch.py --time 2-00:00:00 --qos default --nodes 1 --hp --name `
+You can queue up runs in SLURM environments by running `python main_sbatch.py --hp --time