Reproducibility Study of “Studying How to Efficiently and Effectively Guide Models with Explanations”
This repository contains the code for a project aiming to reproduce the study "Studying How to Efficiently and Effectively Guide Models with Explanations". If you use this code, please cite the original paper:
@inproceedings{rao2023studying,
title={Studying How to Efficiently and Effectively Guide Models with Explanations},
author={Rao, Sukrut and B{\"o}hle, Moritz and Parchami-Araghi, Amin and Schiele, Bernt},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={1922--1933},
year={2023}
}
While deep neural networks have excelled in diverse research domains, there is no assurance that these models are learning the correct features. Certain models may rely on spurious correlations in their predictions, for example by putting undue attention on the background. This leads to unfair and inexplicable decisions and consequently limits a model's ability to generalize. To inspect whether models rely on spurious correlations in their decision-making, attribution methods have been developed. When incorporated alongside a classification model, these methods can steer the model's attention toward relevant features, ensuring that the model is right for the right reasons.
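To make the guidance idea concrete, below is a minimal sketch of an EPG-style (Energy) localization loss, assuming per-pixel attributions and binary bounding-box masks are available as tensors. The function name `energy_guidance_loss` and its signature are illustrative only and not the repository's API (see losses.py for the actual loss implementations).

```python
import torch

def energy_guidance_loss(attributions, box_mask, eps=1e-8):
    """Illustrative EPG-style localization loss (not the repository's exact implementation).

    attributions: (B, H, W) per-pixel attribution map for the target class
    box_mask:     (B, H, W) binary mask, 1 inside the ground-truth bounding boxes
    Returns a loss that is minimized when all positive attribution energy
    falls inside the bounding boxes.
    """
    pos_attr = attributions.clamp(min=0)                # only positive evidence counts
    energy_inside = (pos_attr * box_mask).sum(dim=(1, 2))
    energy_total = pos_attr.sum(dim=(1, 2)) + eps
    epg = energy_inside / energy_total                  # Energy Pointing Game score in [0, 1]
    return (1.0 - epg).mean()                           # higher EPG -> lower loss
```

During guided fine-tuning, such a localization term is added to the classification loss with a weight set via --localization_loss_lambda (see the training examples below).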
This study aims to replicate the original paper and investigates the reproducibility of its main claims.
├── README.md
├── environment.yml - environment to run the code
├── datasets
│ ├── VOC2007 - will hold VOC-2007 data (automatically downloaded)
│ │ └── preprocess.py - download file for VOC-2007 data (will create new directories)
│ └── WATERBIRDS - will hold Waterbirds-100% data (automatically downloaded)
│ └── preprocess.py - download file for Waterbirds-100% data (will create new directories)
├── examples - example images for this README
├── BASE - will hold the trained baseline models
│ ├── VOC2007
│ └── WATERBIRDS
├── base_logs - will hold the tensorboard logs for the baseline models
├── FT - will hold the trained fine-tuned models
│ ├── VOC2007
│ └── WATERBIRDS
├── ft_logs - will hold the tensorboard logs for the fine-tuned models
├── images - will hold the images for this reproducibility study
├── p_c_ann - will hold the .npz files for the Pareto front of the models trained with the coarse annotations
├── p_c_dil - will hold the .npz files for the Pareto front of the models trained with the dilated annotations
├── p_curves - will hold the .npz files for the Pareto front of the models trained on different datasets
├── p_curves_bin0.002 - will hold the .npz files for the Pareto front of the models trained on different datasets with a bin size of 0.002
├── weights - will hold a script to download the pre-trained ImageNet weights
├── attribution_methods.py - defines the attribution methods
├── datasets.py - defines the datasets
├── eval.py - will hold the code to evaluate the models
├── fairness.py - will hold the code to evaluate the fairness of the model on different classes
├── fixnpz.py - will hold the code to fix the .npz files
├── fixup_resnet.py - defines the fixup_resnet model
├── hubconf.py - defines the configs for the models
├── losses.py - defines the loss functions used for training
├── metrics.py - defines metrics to be used during training and testing
├── model_activators.py - defines the ResNet model activators
├── pareto_dil_lim.py - will hold the code to compute the Pareto front of the models trained with the dilated annotations
├── pareto_FT.py - will hold the code to compute the Pareto front of the models trained on different datasets
├── tensorboard_proccessing.py - will hold the code to process the tensorboard logs
├── train.py - main script to train the models
├── utils.py - defines utility functions
└── visualize.py - defines functions to visualize the results
The code was developed and tested on a machine with an NVIDIA A100 40GB GPU. The code should run on any machine with a GPU that has at least 24 GB of VRAM for training and 16 GB of VRAM for evaluation/testing.
All required packages can be found in the environment.yml file. They are most easily installed with conda/miniconda, which can create a new environment like this:
conda env create -f environment.yml
After that, activate the new environment (this needs to be done every time the shell is reloaded):
conda activate repro_model_guidance
Should you get any errors during the environment setup, make sure to set
conda config --set channel_priority false
and then try again.
The Pascal VOC-2007 training, validation, and test sets should be automatically downloaded to datasets/VOC2007/
after you run the preprocess.py
script in the corresponding directory. If that does not work, the training and validation data is available here and the testing data is available here.
The Waterbirds-100% dataset should be automatically downloaded to datasets/WATERBIRDS/
after you run the preprocess.py
script in the corresponding directory. If that does not work, the Waterbirds-100% dataset is available here and the Caltech-UCSD Birds-200-2011 dataset, which contains the bounding_boxes.txt file, is available here.
For either dataset, the datasets folder is set as the default base directory in the code. If you move (or want to download) the datasets to a different place, make sure to point the code to the new directory as follows (additional arguments omitted; for more information on how to run the code, see the training and evaluation sections below):
python train.py --data_path="<path to your directory containing the VOC2007 or WATERBIRDS subdirectories>"
The --data_path argument can either be an absolute path or a relative path from the repository root (which contains the train.py script).
To download the training, validation, and test sets, run the following commands:
python preprocess.py --split train
python preprocess.py --split val
python preprocess.py --split test
A script to download the pre-trained ImageNet weights for the B-cos and X-DNN backbones is provided in the weights directory; store the ImageNet pre-trained weights for the X-DNN and B-cos models there. To download them, run:
weights/download.sh
A demo notebook that contains code to regenerate all the key reproducibility results that are presented in our paper is included in this repository. It also contains the code to download the datasets and model weights.
Start a JupyterLab
session using jupyter lab
and run the notebook demo_notebook.ipynb
to reproduce the results.
To train a model, use:
python train.py [options]
The list of options and their descriptions can be found by using:
python train.py -h
For example, to train a B-cos model on VOC2007, use:
python train.py --model_backbone bcos --dataset VOC2007 --learning_rate 1e-4 --train_batch_size 64 --total_epochs 300
For example, to optimize B-cos attributions using the Energy loss at the Input layer, use:
python train.py --model_backbone bcos --dataset VOC2007 --learning_rate 1e-4 --train_batch_size 64 --total_epochs 50 --optimize_explanations --model_path models/VOC2007/bcos_standard_attrNone_loclossNone_origNone_resnet50_lr1e-04_sll1.0_layerInput/model_checkpoint_f1_best.pt --localization_loss_lambda 1e-3 --layer Input --localization_loss_fn Energy --pareto
For example, to evaluate a trained baseline B-cos model on the VOC2007 test set, use:
python eval.py --model_path BASE/VOC2007/bcos_standard_attrNone_loclossNone_origNone_resnet50_lr0.0001_sll1.0_layerInput/model_checkpoint_final_300.pt --log_path ./base_logs/VOC2007/EVAL/ --dataset VOC2007 --fix_layer Input --vis_iou_thr_methods
For example, to compute the Pareto front of the fine-tuned models on VOC2007, use:
python pareto_FT.py --data_path "datasets/" --save_path "p_curves/" --dataset "VOC2007" --split "test" --eval_batch_size 4 --seed 0
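For context, the Pareto front keeps only models that are not dominated in both classification performance (e.g. F1) and localization (e.g. EPG). Below is a minimal sketch under that assumption; the helper pareto_front and the example numbers are illustrative and do not mirror the exact logic of pareto_FT.py.

```python
def pareto_front(points):
    """Return the non-dominated (F1, EPG) pairs; both metrics are to be maximized.

    points: list of (f1, epg) tuples, one per trained model.
    A point is dropped if another point is at least as good in both metrics
    and strictly better in at least one.
    """
    front = []
    for i, (f1_i, epg_i) in enumerate(points):
        dominated = any(
            (f1_j >= f1_i and epg_j >= epg_i) and (f1_j > f1_i or epg_j > epg_i)
            for j, (f1_j, epg_j) in enumerate(points) if j != i
        )
        if not dominated:
            front.append((f1_i, epg_i))
    return sorted(front)

# Example with illustrative numbers:
# pareto_front([(0.80, 0.55), (0.78, 0.60), (0.75, 0.50)])
# -> [(0.78, 0.60), (0.80, 0.55)]   (the third model is dominated)
```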
Claim 1: Bounding Boxes are sufficient to guide models.
Although relatively coarse, bounding box supervision can provide sufficient guidance to models while being much cheaper to obtain than semantic segmentation masks. We replicate the results of the paper and show that bounding boxes are sufficient to guide models.
Claim 2: Energy loss constitutes a good loss function for model guidance.
The proposed EPG score constitutes a good loss function for model guidance, particularly when using bounding boxes. We replicate the results of the paper and show that the EPG score is a good loss function for model guidance.
Claim 3: Guiding the models can be done cost-effectively with sparse and noisy bounding boxes.
Model guidance can be accomplished cost-effectively by using annotation masks that are noisy or are available for only a small fraction of the training data. We replicate the results of the paper and show that model guidance can be done cost-effectively with sparse and noisy bounding boxes.
Claim 4: Model guidance can improve model generalization.
Model guidance with a small number of annotations can improve model generalization under distribution shifts at test time. We replicate the results of the paper and show that model guidance can improve model generalization.
We conduct additional experiments and analysis to further investigate the main claims of the paper. This additional experiment aims to assess the class-specific fairness of models trained with guidance. While the original work shows improved localization performance for guided models, it is unclear whether all categories benefit equally from model guidance.
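As a sketch of how such a per-class analysis can be set up, assuming per-sample EPG scores and their class labels are already available (the helper name and structure are illustrative and not the interface of fairness.py):

```python
from collections import defaultdict

def per_class_epg(epg_scores, class_labels):
    """Average a localization metric (e.g. EPG) separately per class.

    epg_scores:   list of floats, one EPG score per (image, class) evaluation
    class_labels: list of class names/ids of the same length
    Returns {class: mean EPG}, which makes it easy to spot classes that
    benefit less from model guidance than the dataset-wide average suggests.
    """
    per_class = defaultdict(list)
    for score, label in zip(epg_scores, class_labels):
        per_class[label].append(score)
    return {label: sum(scores) / len(scores) for label, scores in per_class.items()}
```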
This repository uses and builds upon code from the following repositories: