
Prepare target models before running attacks #249

Merged — 9 commits merged into main from mzweilin/prepare_model on May 14, 2024
Conversation

@mzweilin (Contributor) commented on May 2, 2024

What does this PR do?

This PR adds two preparations before running attacks in an external Lightning pipeline.

  1. Turn off the PyTorch inference mode, so that we can create perturbation variables that require gradients.
  2. Switch the target model to training mode, except for BatchNorm and Dropout layers, when we have to borrow training_step() (see the sketch below).
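
As a rough, self-contained sketch of what these two preparations amount to (illustrative only; the function name and the attack loop placeholder are not MART's actual API):

```python
import torch
from torch import nn


def prepare_and_run_attack(target_model: nn.Module, batch):
    # 1. Leave inference mode so the perturbation can require gradients.
    with torch.inference_mode(False):
        was_training = target_model.training
        # 2. Borrow training mode so training_step()-style code returns a loss,
        #    but keep BatchNorm/Dropout in eval mode so running statistics and
        #    dropout masks are unaffected by the attack.
        target_model.train(True)
        for m in target_model.modules():
            if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d, nn.Dropout)):
                m.eval()
        try:
            x, _y = batch
            perturbation = torch.zeros_like(x, requires_grad=True)
            # ... run the attack's optimization loop on `perturbation` here ...
        finally:
            target_model.train(was_training)
    return perturbation
```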

Type of change

Please check all relevant options.

  • Improvement (non-breaking)
  • Bug fix (non-breaking)
  • New feature (non-breaking)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Testing

Please describe the tests that you ran to verify your changes. Consider listing any relevant details of your test configuration.

  • pytest
  • CUDA_VISIBLE_DEVICES=0 python -m mart experiment=CIFAR10_CNN_Adv trainer=gpu trainer.precision=16 reports 70% (21 sec/epoch).
  • CUDA_VISIBLE_DEVICES=0,1 python -m mart experiment=CIFAR10_CNN_Adv trainer=ddp trainer.precision=16 trainer.devices=2 model.optimizer.lr=0.2 trainer.max_steps=2925 datamodule.ims_per_batch=256 datamodule.world_size=2 reports 70% (14 sec/epoch).

Before submitting

  • The title is self-explanatory and the description concisely explains the PR
  • My PR does only one thing, instead of bundling different changes together
  • I list all the breaking changes introduced by this pull request
  • I have commented my code
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have run pre-commit hooks with pre-commit run -a command without errors

Did you have fun?

Make sure you had fun coding 🙃

Base automatically changed from mzweilin/add_batch_c15n_instances to main May 13, 2024 20:52
@mzweilin requested a review from dxoigmn on May 14, 2024 18:33
@@ -151,6 +151,8 @@ def configure_gradient_clipping(
         for group in optimizer.param_groups:
             self.gradient_modifier(group["params"])

+    # Turn off the inference mode, so we will create perturbation that requires gradient.
+    @torch.inference_mode(False)
Contributor

Why is this necessary now? I thought PL manages this already?

Contributor Author

Anomalib turns on the inference mode when we run anomalib test.

MART's trainer turns off the inference mode by default, as in

inference_mode: False

But Anomalib has its own trainer.
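
For context, a minimal standalone illustration (not MART's actual code path) of why the attack needs inference mode off: a perturbation cannot require gradients under inference mode, but a function decorated with torch.inference_mode(False) runs with it disabled even when the surrounding evaluation loop enables it.

```python
import torch


@torch.inference_mode(False)
def run_attack(x):
    # Clone the input: tensors produced under inference mode cannot take
    # part in autograd-recorded computations outside of it.
    x = x.clone()
    # With inference mode off, the perturbation can require gradients.
    delta = torch.zeros_like(x, requires_grad=True)
    loss = ((x + delta) ** 2).sum()
    loss.backward()
    return delta.grad


with torch.inference_mode():  # e.g. what an external test loop might enable
    print(run_attack(torch.randn(4)))
```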

self.training = self.module.training
self.module.train(True)
# Set some children modules of "excludes" to eval mode instead.
self.selective_eval_mode("", self.module, self.excludes)
Contributor

What is going on with the empty string?

Contributor Author

We don't know the variable name of the model, so the module path starts with a dot. This is for debug logging only, which prints messages like this:

Set .model.student_model.feature_extractor.layer3[1].bn1: BatchNorm2d to eval mode.
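
A rough sketch of how such a recursive helper could build that dotted path for logging (a hypothetical free-function version; the actual method in this PR may differ):

```python
import logging

from torch import nn

logger = logging.getLogger(__name__)


def selective_eval_mode(path, module, excludes):
    # Walk the module tree; the root is passed as "" because the model's
    # variable name is unknown, which is why logged paths start with a dot.
    for name, child in module.named_children():
        child_path = f"{path}.{name}"
        if isinstance(child, excludes):
            logger.debug("Set %s: %s to eval mode.", child_path, type(child).__name__)
            child.eval()
        else:
            selective_eval_mode(child_path, child, excludes)
```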

with MonkeyPatch(pl_module, "log", lambda *args, **kwargs: None):
    outputs = pl_module.training_step(batch, dataloader_idx)
with training_mode(
Contributor

What is the use case here? Are you seeing train-specific code diverging from eval-specific code in some use case?

Contributor Author

Yes. Many model implementations return the prediction in eval mode and the loss in training mode.

In our use case, anomalib test runs the model in eval mode, so we won't get the loss.
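
A toy example of that divergence (not taken from Anomalib or MART): the same module returns a loss only in training mode, so the attack has to borrow training mode to get something to optimize.

```python
import torch
import torch.nn.functional as F
from torch import nn


class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(8, 2)

    def forward(self, x, y=None):
        logits = self.net(x)
        if self.training:
            # Training mode: return the loss an attack can maximize.
            return F.cross_entropy(logits, y)
        # Eval mode: return predictions only; there is no loss to attack.
        return logits.argmax(dim=-1)


model = ToyModel()
x, y = torch.randn(4, 8), torch.randint(0, 2, (4,))
model.eval()
print(model(x))     # predictions
model.train()
print(model(x, y))  # loss
```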

@mzweilin requested a review from dxoigmn on May 14, 2024 22:25
@mzweilin merged commit c117823 into main on May 14, 2024
5 checks passed
@mzweilin deleted the mzweilin/prepare_model branch on May 14, 2024 23:22