This repository provides tools for evaluating various foundation models on Earth Observation tasks. For detailed instructions on pretraining and deploying DOFA, please refer to the main DOFA repository.
Navigate into the root directory of this repository and run:

```bash
conda create -n dofa-pytorch python=3.10 --yes
conda activate dofa-pytorch

pip install -U openmim
pip install torch==2.1.2
mim install mmcv==2.1.0 mmsegmentation==1.2.2
pip install -e .
```
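If you want to verify that the pinned versions resolved correctly, a quick optional sanity check from Python is:

```python
# Optional sanity check that the pinned dependencies installed as expected.
import torch
import mmcv
import mmseg

print(torch.__version__)  # expected: 2.1.2
print(mmcv.__version__)   # expected: 2.1.0
print(mmseg.__version__)  # expected: 1.2.2
```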
You currently do not need to install the ViT Adapter components below, as they are not used in the current version of the repository. This step is optional and relies on a CUDA toolkit version below 12.

To use the ViT Adapter:

```bash
cd src/foundation_models/modules/ops/
sh make.sh
```
Pretrained model weights are available on Hugging Face.
- SenPa-MAE: The cloud storage link for the weights is here. Download the weights and place them in the `MODEL_WEIGHTS_DIR` directory. You can use `gdown <UID>` to download them from the Google Drive link (the UID is the part of the URL between `/d/` and `/view?usp=drive_link`).
You can set these environment variables in a `.env` file in the root directory. The variables defined there are automatically exported and used by the different scripts, so make sure to set the following:

```bash
MODEL_WEIGHTS_DIR=<path/to/where/you/want/to/store/weights>
TORCH_HOME=<path/to/where/you/want/to/store/torch/hub/weights>
DATASETS_DIR=<path/to/where/you/want/to/store/all/other/datasets>
GEO_BENCH_DIR=<path/to/where/you/want/to/store/GeoBench>
ODIR=<path/to/where/you/want/to/store/logs>
REPO_PATH=<path/to/this/repo>
```
When using any of the FMs, the init method will check whether it can find the pretrained checkpoint of the respective FM in the `MODEL_WEIGHTS_DIR` set above, and will download it there if it is not found. If you do not change the environment variable, the default is `./fm_weights`.
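The exact checkpoint names and download locations differ per model; the snippet below is only a sketch of that check-then-download behaviour, with placeholder repo and file names rather than the actual Hugging Face locations.

```python
import os

from huggingface_hub import hf_hub_download  # weights are hosted on Hugging Face

# Sketch of the check-then-download pattern described above (placeholder names).
weights_dir = os.environ.get("MODEL_WEIGHTS_DIR", "./fm_weights")
ckpt = os.path.join(weights_dir, "<checkpoint_filename>.pth")
if not os.path.exists(ckpt):
    ckpt = hf_hub_download(
        repo_id="<org>/<model_repo>",          # placeholder, not a real repo id
        filename="<checkpoint_filename>.pth",
        local_dir=weights_dir,
    )
```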
Some models depend on torch hub, which by default loads models to `~/.cache/torch/hub`. If you would like to change this directory, for example to have a single place where all weights across models are stored, you can also change `TORCH_HOME`.
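Since torch hub respects `TORCH_HOME`, you can confirm where hub checkpoints will be cached with:

```python
import torch

# torch.hub resolves its cache from TORCH_HOME; if unset, it defaults to
# ~/.cache/torch/hub.
print(torch.hub.get_dir())
```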
This repository includes the following models for evaluation:
- CROMA
- DOFA
- GFM
- RemoteCLIP
- SatMAE
- ScaleMAE
- Skyscript
- SoftCON
- AnySat
The following datasets are currently supported:
- GeoBench
- BigEarthNetV2
- Resisc45
To add a new model or dataset for evaluation, follow these steps:

- **Add a Model Wrapper:**
  - Create a new model wrapper in the `foundation_models` folder.
  - Add the new model to `__init__.py` for integration.
  - Register the model in `factory.py` by adding its name, so that it is accessible via the `model_type` parameter (see the sketch after this list).
- **Add a Dataset Wrapper:**
  - Create a new dataset wrapper in the `datasets` folder.
  - Register the dataset in `factory.py` to ensure access.
- **Configuration Setup:** This project uses Hydra for experiment configuration. In the `configs` directory there is a subdirectory each for models and datasets, where you need to add a config file for the new model and dataset. Hydra is a powerful and flexible way to configure experiments; however, it can be a bit confusing at first, since there is an interplay between different hierarchy levels plus environment variables. We differentiate between three kinds of configs:
  - Model config: all variables specific to a particular model
  - Dataset config: all variables specific to a particular dataset
  - Other config: variables applied on the command line as additional arguments for experiment configuration, such as the output directory, number of GPUs, etc.

  Any model config parameter can always be overwritten with `model.{param_name}={something}`, and likewise any dataset parameter with `dataset.{param_name}={something}`. In the `main.py` script, which carries the Hydra decorator, they are all merged and made available under the `cfg` object (see the sketch after this list). For more examples, see the Hydra override examples.
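As a concrete illustration of the registration step in "Add a Model Wrapper" above, the following is a minimal sketch only: the wrapper interface and the way `factory.py` maps names to classes in this repository may differ, and `MyModelWrapper`/`MODEL_REGISTRY` are illustrative names.

```python
# Hypothetical sketch: foundation_models/my_model_wrapper.py
import torch
import torch.nn as nn


class MyModelWrapper(nn.Module):
    """Thin wrapper exposing a new foundation model through a common interface."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Replace these stubs with the real backbone and checkpoint loading.
        self.backbone = nn.Identity()
        self.head = nn.LazyLinear(num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x).flatten(1))


# In factory.py, the wrapper is then registered under a name so it can be
# selected via the `model_type` parameter, e.g. through a mapping like:
MODEL_REGISTRY = {
    "my_model": MyModelWrapper,
}
```

Similarly, the config merging described under "Configuration Setup" follows standard Hydra usage. The sketch below shows how an entry point like `src/main.py` typically consumes the merged config; the config path/name and the accessed fields are assumptions, not the repository's exact code.

```python
# Hypothetical sketch of a Hydra entry point similar to src/main.py.
import hydra
from omegaconf import DictConfig


@hydra.main(version_base=None, config_path="configs", config_name="config")
def main(cfg: DictConfig) -> None:
    # Model, dataset, and other configs are merged into a single cfg object.
    # Any field can be overridden on the command line, e.g.
    # model.{param_name}={value} or dataset.{param_name}={value}.
    print(cfg.model)
    print(cfg.dataset)
    print(cfg.output_dir)


if __name__ == "__main__":
    main()
```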
TODO: Explain how to run LoRA.
We have implemented a series of unit tests that check for runtime bugs. They are also run on new PRs to verify that changes do not break other parts of the code. To run the tests, `pip install pytest` and then, from the root directory, simply run `pytest tests/` to run all test files inside the `tests/` directory, or `pytest tests/test_{model_name}` for a specific unit test file.
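If you add a new model or dataset, adding a matching test is straightforward; the sketch below only illustrates the usual pytest pattern, with a stand-in model instead of the repository's actual wrappers and configs.

```python
# Hypothetical sketch: tests/test_my_model.py
import pytest
import torch


@pytest.mark.parametrize("batch_size", [1, 2])
def test_forward_shapes(batch_size):
    # Stand-in for a wrapped model; a real test would build it via the factory.
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.LazyLinear(10))
    x = torch.randn(batch_size, 3, 64, 64)
    out = model(x)
    assert out.shape == (batch_size, 10)
```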
To run evaluation on any of the models, you can use the following example:

```bash
export $(cat .env)
echo "Output Directory: $ODIR"
echo "Model Size: $MODEL_SIZE"

python src/main.py \
    output_dir=${ODIR}/exps/dinov2_cls_linear_probe_benv2_rgb \
    model=dinov2_cls_linear_probe \
    dataset=benv2_rgb \
    lr=0.002 \
    task=classification \
    num_gpus=0 \
    num_workers=8 \
    epochs=30 \
    warmup_epochs=5 \
    seed=13
```
The `model` and `dataset` arguments are the names of the config `.yaml` files under the `src/configs` directory. Additional arguments can be passed on the command line: essentially, anything accessed as `cfg.{something}` in `src/main.py` can be overridden by passing that argument with the desired value, which overwrites the corresponding config entry.
There is a convenience script, `scripts/generate_bash_scripts.py`, for generating such shell scripts for running experiments. You can modify it to your needs; it will generate a separate shell script for every experiment you want to run, stored in its own folder under `scripts/<dataset>/run_<model>_<dataset>.sh`.
You can use the following command to run an experiment:

```bash
cd <path/to/this/repo>
sh scripts/<path/to/your/experiment>.sh
```
There is also a script for optimizing hyperparameters with Ray Tune, similar to the Hydra setup above but with additional parameters for hyperparameter tuning. The Python file `generate_bash_scripts_ray_tune.py` can generate bash scripts that execute the `src/hparam_ray_hdra.py` script, which optimizes the learning rate and batch size. The additional Ray-relevant parameters are `cfg.ray.{something}` inside that script. Some defaults are provided, but if you need more specific control over the Ray Tune configuration, additional Ray arguments can be passed on the command line or in a script with the plus sign (`+`). For more information, see how `generate_bash_scripts_ray_tune.py` configures an experiment.
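For orientation, tuning the learning rate and batch size with Ray Tune boils down to defining a search space like the sketch below; the exact parameter names, ranges, and scheduler used by `src/hparam_ray_hdra.py` may differ.

```python
from ray import tune

# Illustrative search space for the two hyperparameters mentioned above;
# the actual definitions live in the repository's tuning script.
search_space = {
    "lr": tune.loguniform(1e-5, 1e-2),
    "batch_size": tune.choice([16, 32, 64, 128]),
}
```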
We welcome contributions! If you'd like to add new models, datasets, or evaluation scripts, please submit a pull request, and ensure that you have tested your changes.