Revisiting Residual Networks for Adversarial Robustness: An Architectural Perspective [arXiv]
(Left) Impact of architectural components on adversarial robustness on CIFAR-10, relative to that of adversarial training methods.
(Right) Progress of SotA robust accuracy against AutoAttack without additional data on CIFAR-10 with
The design of a block primarily comprises its topology, type of convolution and kernel size, choice of activation, and normalization. We examine these elements independently through controlled experiments and propose a novel residual block, dubbed RobustResBlock, based on our observations. An overview of RobustResBlock is provided below:
| | Params | FLOPs | | | Checkpoint |
|---|---|---|---|---|---|
| | 39.6M | 6.00G | 57.70 | 54.71 | [BaiduDisk] |
| | 70.5M | 10.6G | 58.46 | 55.56 | [BaiduDisk] |
| | 133M | 19.6G | 59.41 | 56.62 | [BaiduDisk] |
| | 270M | 39.3G | 60.48 | 57.78 | [BaiduDisk] |
We allow the depth of each stage (
We allow the width (in terms of widening factors) of each stage (
Target FLOPs | Scale by | D1 | W1 | D2 | W2 | D3 | W3 | Params | FLOPs | | | Checkpoint |
---|---|---|---|---|---|---|---|---|---|---|---|---|
5G | Depth | 5 | 10 | 5 | 10 | 2 | 10 | 24.0M | 5.25G | 56.05 | 53.14 | [BaiduDisk] |
5G | Width | 4 | 11 | 4 | 13 | 4 | 6 | 24.5M | 5.71G | 56.89 | 53.87 | [BaiduDisk] |
5G | D&W | 14 | 5 | 14 | 7 | 7 | 3 | 17.7M | 5.09G | 57.49 | 54.78 | [BaiduDisk] |
10G | Depth | 6 | 12 | 6 | 12 | 3 | 12 | 48.5M | 9.59G | 56.42 | 53.91 | [BaiduDisk] |
10G | Width | 5 | 13 | 5 | 16 | 5 | 7 | 44.4M | 10.5G | 57.06 | 54.29 | [BaiduDisk] |
10G | D&W | 17 | 7 | 17 | 9 | 8 | 4 | 39.3M | 9.74G | 58.06 | 55.45 | [BaiduDisk] |
20G | Depth | 9 | 14 | 8 | 14 | 4 | 14 | 90.4M | 18.6G | 57.11 | 54.48 | [BaiduDisk] |
20G | Width | 7 | 16 | 7 | 18 | 7 | 8 | 81.7M | 20.4G | 58.02 | 55.34 | [BaiduDisk] |
20G | D&W | 22 | 8 | 22 | 11 | 11 | 5 | 74.8M | 20.3G | 58.47 | 56.14 | [BaiduDisk] |
40G | Depth | 14 | 16 | 13 | 16 | 11 | 16 | 185M | 38.8G | 57.90 | 55.79 | [BaiduDisk] |
40G | Width | 11 | 18 | 11 | 21 | 11 | 9 | 170M | 42.7G | 58.48 | 56.15 | [BaiduDisk] |
40G | D&W | 27 | 10 | 28 | 14 | 13 | 6 | 147M | 40.4G | 58.76 | 56.59 | [BaiduDisk] |
We use the proposed compound scaling rule to scale RobustResBlock and present a portfolio of adversarially robust residual networks.
Table 3. Comparison to SotA methods with additional 500K data
Method | Model | Params | FLOPs | AA | Checkpoint |
---|---|---|---|---|---|
RST | WRN-28-10 | 36.5M | 5.20G | 59.53 | |
AWP | WRN-28-10 | 36.5M | 5.20G | 60.04 | |
HAT | WRN-28-10 | 36.5M | 5.20G | 62.50 | |
Gowal et al. | WRN-28-10 | 36.5M | 5.20G | 62.80 | |
Huang et al. | WRN-34-R | 68.1M | 19.1G | 62.54 | |
Ours | RobustResNet-A1 | 19.2M | 5.11G | 63.70 | [BaiduDisk] |
Ours | WRN-A4 | 147M | 40.4G | 65.79 | [BaiduDisk] |
from models.resnet import PreActResNet

# D1, D2, D3 are the depths of the three stages and W1, W2, W3 their widening factors.
# They are placeholders here; take the values from Table 2.
depth = [D1, D2, D3]
channels = [16, 16*W1, 32*W2, 64*W3]
block_types = ['robust_res_block', 'robust_res_block', 'robust_res_block']
# Syntax
model = PreActResNet(
    depth_configs=depth,
    channel_configs=channels,
    block_types=block_types,
    scales=8,
    base_width=10,
    cardinality=4,
    se_reduction=64,
    num_classes=10,  # for CIFAR-10/SVHN/MNIST
)
# See Table 2 "D&W" rows for D1, D2, D3 and W1, W2, W3; see below for examples.
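# Model names contain hyphens, so the variables below use underscores instead.
# channel_configs follows [16, 16*W1, 32*W2, 64*W3] with the widening factors from Table 2.
# RobustResNet-A1
robust_resnet_a1 = PreActResNet(
    depth_configs=[14, 14, 7],
    channel_configs=[16, 16*5, 32*7, 64*3],  # W1, W2, W3 = 5, 7, 3
    ...)
# RobustResNet-A2
robust_resnet_a2 = PreActResNet(
    depth_configs=[17, 17, 8],
    channel_configs=[16, 16*7, 32*9, 64*4],  # W1, W2, W3 = 7, 9, 4
    ...)
# RobustResNet-A3
robust_resnet_a3 = PreActResNet(
    depth_configs=[22, 22, 11],
    channel_configs=[16, 16*8, 32*11, 64*5],  # W1, W2, W3 = 8, 11, 5
    ...)
# RobustResNet-A4
robust_resnet_a4 = PreActResNet(
    depth_configs=[27, 28, 13],
    channel_configs=[16, 16*10, 32*14, 64*6],  # W1, W2, W3 = 10, 14, 6
    ...)
# If you prefer to use WRN's block but with our scalings (WRN-A1)
wrn_a1 = PreActResNet(
    depth_configs=[14, 14, 7],
    channel_configs=[16, 16*5, 32*7, 64*3],  # W1, W2, W3 = 5, 7, 3
    block_types=['basic_block', 'basic_block', 'basic_block'],
    ...)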
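The channel rule from the syntax above (`[16, 16*W1, 32*W2, 64*W3]`) can be wrapped in a small helper so that a Table 2 row maps directly to the two list arguments. This is only a sketch: `make_configs` is a hypothetical name that does not exist in the repository, and it builds the lists without touching `PreActResNet` itself.

```python
def make_configs(depths, widths, base_channels=(16, 32, 64), stem_channels=16):
    """Build (depth_configs, channel_configs) from stage depths and widening factors.

    Hypothetical helper, not part of the repository. It only applies the rule
    channels = [16, 16*W1, 32*W2, 64*W3] quoted in the syntax above.
    """
    if len(depths) != 3 or len(widths) != 3:
        raise ValueError("expected exactly three stages")
    channel_configs = [stem_channels] + [b * w for b, w in zip(base_channels, widths)]
    return list(depths), channel_configs


# A1 configuration: D = (14, 14, 7), W = (5, 7, 3)
depth_configs, channel_configs = make_configs((14, 14, 7), (5, 7, 3))
print(depth_configs, channel_configs)  # [14, 14, 7] [16, 80, 224, 192]
```

The two returned lists would then be passed as `depth_configs` and `channel_configs`, with the remaining arguments unchanged.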
from models.resnet import RobustResBlock
# See Table 1 above for the performance of RobustResBlock
block = RobustResBlock(
in_chs, out_chs,
kernel_size=3,
scales=8,
base_width=10,
cardinality=4,
se_reduction=64,
activation='ReLU',
normalization='BatchNorm')
Please see examples/compound_scaling.ipynb
- Download the checkpoints, which should contain the following:
  - arch_xxx/
    - arch_xxx.log # training log
    - arch_xxx.yaml # configuration file
    - checkpoints/
      - arch_xxx.pth # last epoch checkpoint
      - arch_xxx_best.pth # checkpoint for best robust acc on valid set
- Run the following lines to evaluate adversarial robustness
python eval_robustness.py \
--data "path to data" \
--config_file_path "path to configuration yaml file" \
--checkpoint_path "path to checkpoint pth file" \
--save_path "path to file for logging evaluation" \
--attack_choice [FGSM/PGD/CW/AA] \
--num_steps [1/20/40/0] \
--batch_size 100 # batch size for evaluation, adjust according to your GPU memory
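To evaluate several attacks in a row, the command above can be assembled programmatically. The sketch below makes two assumptions that the README does not state outright: that the i-th attack in `[FGSM/PGD/CW/AA]` pairs with the i-th value in `[1/20/40/0]`, and that the `.pth` files sit inside `checkpoints/` within the downloaded folder. `build_eval_command` is a hypothetical helper, not part of the repository, and it only builds the argument list without running anything.

```python
from pathlib import Path

# Assumption: attacks and step counts pair up by position in the README command.
ASSUMED_STEPS = {"FGSM": 1, "PGD": 20, "CW": 40, "AA": 0}


def build_eval_command(arch_dir, data_dir, attack, save_path, batch_size=100):
    """Return the eval_robustness.py argument list for one downloaded folder."""
    if attack not in ASSUMED_STEPS:
        raise ValueError(f"unknown attack: {attack}")
    arch_dir = Path(arch_dir)
    name = arch_dir.name  # e.g. "arch_xxx"
    return [
        "python", "eval_robustness.py",
        "--data", str(data_dir),
        "--config_file_path", str(arch_dir / f"{name}.yaml"),
        "--checkpoint_path", str(arch_dir / "checkpoints" / f"{name}_best.pth"),
        "--save_path", str(save_path),
        "--attack_choice", attack,
        "--num_steps", str(ASSUMED_STEPS[attack]),
        "--batch_size", str(batch_size),
    ]


cmd = build_eval_command("downloads/arch_xxx", "data", "PGD", "eval_pgd.log")
print(" ".join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the repository root.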
Model | Params | FLOPs | Clean | PGD-20 | CW-40 | AA | Checkpoint |
---|---|---|---|---|---|---|---|
WRN-28-10 | 36.5M | 5.20G | 84.62 | 55.90 | 53.15 | 51.66 | [BaiduDisk] |
RobNet-large-v2 | 33.3M | 5.10G | 84.57 | 52.79 | 48.94 | 47.48 | [BaiduDisk] |
AdvRush | 32.6M | 4.97G | 84.95 | 56.99 | 53.27 | 52.90 | [BaiduDisk] |
RACL | 32.5M | 4.93G | 83.91 | 55.98 | 53.22 | 51.37 | [BaiduDisk] |
RRN-A1 (ours) | 19.2M | 5.11G | 85.46 | 58.47 | 55.72 | 54.42 | [BaiduDisk] |
WRN-34-12 | 66.5M | 9.60G | 84.93 | 56.01 | 53.53 | 51.97 | [BaiduDisk] |
WRN-34-R | 68.1M | 19.1G | 85.80 | 57.35 | 54.77 | 53.23 | [BaiduDisk] |
RRN-A2 (ours) | 39.0M | 10.8G | 85.80 | 59.72 | 56.74 | 55.49 | [BaiduDisk] |
WRN-46-14 | 128M | 18.6G | 85.22 | 56.37 | 54.19 | 52.63 | [BaiduDisk] |
RRN-A3 (ours) | 75.9M | 19.9G | 86.79 | 60.10 | 57.29 | 55.84 | [BaiduDisk] |
WRN-70-16 | 267M | 38.8G | 85.51 | 56.78 | 54.52 | 52.80 | [BaiduDisk] |
RRN-A4 (ours) | 147M | 39.4G | 87.10 | 60.26 | 57.90 | 56.29 | [BaiduDisk] |
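As a quick worked reading of the table above, the snippet below computes the AA gain and the parameter saving for the smallest and largest pairs at matching FLOPs. The numbers are copied from the table; reading the values ending in M as parameter counts in millions is our interpretation of the column, and `compare` is a name introduced only for this illustration.

```python
# Rows copied from the table above: (parameters in millions, AA accuracy in %).
rows = {
    "WRN-28-10": (36.5, 51.66),
    "RRN-A1": (19.2, 54.42),
    "WRN-70-16": (267.0, 52.80),
    "RRN-A4": (147.0, 56.29),
}


def compare(baseline, ours):
    """Return (AA gain in percentage points, fraction of parameters saved)."""
    base_params, base_aa = rows[baseline]
    our_params, our_aa = rows[ours]
    return round(our_aa - base_aa, 2), round(1 - our_params / base_params, 3)


for baseline, ours in [("WRN-28-10", "RRN-A1"), ("WRN-70-16", "RRN-A4")]:
    gain, saved = compare(baseline, ours)
    print(f"{ours} vs {baseline}: +{gain:.2f} AA points, {saved:.1%} fewer parameters")
```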
Model | Params | FLOPs | Clean | PGD-20 | CW-40 | AA | Checkpoint |
---|---|---|---|---|---|---|---|
WRN-28-10 | 36.5M | 5.20G | 56.30 | 29.91 | 26.22 | 25.26 | [BaiduDisk] |
RobNet-large-v2 | 33.3M | 5.10G | 55.27 | 29.23 | 24.63 | 23.69 | [BaiduDisk] |
AdvRush | 32.6M | 4.97G | 56.40 | 30.40 | 26.16 | 25.27 | [BaiduDisk] |
RACL | 32.5M | 4.93G | 56.09 | 30.38 | 26.65 | 25.65 | [BaiduDisk] |
RRN-A1 (ours) | 19.2M | 5.11G | 59.34 | 32.70 | 27.76 | 26.75 | [BaiduDisk] |
WRN-34-12 | 66.5M | 9.60G | 56.08 | 29.87 | 26.51 | 25.47 | [BaiduDisk] |
WRN-34-R | 68.1M | 19.1G | 58.78 | 31.17 | 27.33 | 26.31 | [BaiduDisk] |
RRN-A2 (ours) | 39.0M | 10.8G | 59.38 | 33.00 | 28.71 | 27.68 | [BaiduDisk] |
WRN-46-14 | 128M | 18.6G | 56.78 | 30.03 | 27.27 | 26.28 | [BaiduDisk] |
RRN-A3 (ours) | 75.9M | 19.9G | 60.16 | 33.59 | 29.58 | 28.48 | [BaiduDisk] |
WRN-70-16 | 267M | 38.8G | 56.93 | 29.76 | 27.20 | 26.12 | [BaiduDisk] |
RRN-A4 (ours) | 147M | 39.4G | 61.66 | 34.25 | 30.04 | 29.00 | [BaiduDisk] |
Model | Params | FLOPs | | | Checkpoint |
---|---|---|---|---|---|
WRN-28-10 | 36.5M | 5.20G | 52.44 | 50.97 | [BaiduDisk] |
RRN-A1 (ours) | 19.2M | 5.11G | 57.62 | 56.06 | [BaiduDisk] |
WRN-34-12 | 66.5M | 9.60G | 52.85 | 51.36 | [BaiduDisk] |
RRN-A2 (ours) | 39.0M | 10.8G | 58.39 | 56.99 | [BaiduDisk] |
WRN-46-14 | 128M | 18.6G | 53.67 | 52.95 | [BaiduDisk] |
RRN-A3 (ours) | 75.9M | 19.9G | 58.81 | 57.60 | [BaiduDisk] |
WRN-70-16 | 267M | 38.8G | 54.12 | 50.52 | [BaiduDisk] |
RRN-A4 (ours) | 147M | 39.4G | 59.01 | 57.85 | [BaiduDisk] |
Model | Params | FLOPs | | | Checkpoint |
---|---|---|---|---|---|
WRN-28-10 | 36.5M | 5.20G | 57.69 | 52.88 | [BaiduDisk] |
RRN-A1 (ours) | 19.2M | 5.11G | 59.34 | 54.42 | [BaiduDisk] |
WRN-34-12 | 66.5M | 9.60G | 57.40 | 53.11 | [BaiduDisk] |
RRN-A2 (ours) | 39.0M | 10.8G | 60.33 | 55.51 | [BaiduDisk] |
WRN-46-14 | 128M | 18.6G | 58.43 | 54.32 | [BaiduDisk] |
RRN-A3 (ours) | 75.9M | 19.9G | 60.95 | 56.52 | [BaiduDisk] |
WRN-70-16 | 267M | 38.8G | 58.15 | 54.37 | [BaiduDisk] |
RRN-A4 (ours) | 147M | 39.4G | 61.88 | 57.55 | [BaiduDisk] |
# --master_port: use a random port number
# --exp_name: path to where you want to store training stats
# --version: one of WRN-A1/A2/A3/A4; you may also change it to RobustResNet-A1/A2/A3/A4
python -m torch.distributed.launch \
    --nproc_per_node=2 --master_port 24220 \
    main_dist.py \
    --config_path ./configs/CIFAR10 \
    --exp_name ./exps/CIFAR10 \
    --version [WRN-A1/A2/A3/A4] \
    --train \
    --data_parallel \
    --apex-amp
Please download the additional pseudolabeled data from Carmon et al., 2019.
# --master_port: use a random port number
# --log-dir: path to where you want to store training stats
# --desc: name of the folder for storing training stats
# --adv-eval-freq: evaluation frequency; evaluating too often will significantly slow down training
# --start-eval: start evaluating after N epochs
# --aux-data-filename: location of the downloaded pseudolabeled data
python -m torch.distributed.launch \
    --nproc_per_node=8 --master_port 14226 \
    adv-main_dist.py \
    --log-dir ./checkpoints/ \
    --config-path ./configs/Advanced_CIFAR10 \
    --version [WRN-A1/A2/A3/A4] \
    --desc drna4-basic-silu-apex-500k \
    --apex-amp --adv-eval-freq 5 \
    --start-eval 310 \
    --apex_amp --advnorm --adjust_bn True \
    --num-adv-epochs 400 --batch-size 1024 --lr 0.4 --weight-decay 0.0005 --beta 6.0 \
    --data-dir /datasets/ --data cifar10s \
    --aux-data-filename /datasets/ti_500K_pseudo_labeled.pickle \
    --unsup-fraction 0.7
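To get a feel for what `--unsup-fraction 0.7` means at `--batch-size 1024`, the sketch below splits one batch into labeled and pseudo-labeled samples. It assumes the flag is the share of each batch drawn from the pseudo-labeled data and that this share is rounded up, as semi-supervised samplers commonly do; the training script may round differently, and `split_batch` is a hypothetical name, not a function from the repository.

```python
import math


def split_batch(batch_size, unsup_fraction):
    """Split a batch into (labeled, pseudo-labeled) sample counts.

    Assumption: the pseudo-labeled share is rounded up; the repository's
    sampler may use a different rounding rule.
    """
    if not 0.0 <= unsup_fraction <= 1.0:
        raise ValueError("unsup_fraction must be in [0, 1]")
    n_unsup = math.ceil(unsup_fraction * batch_size)
    return batch_size - n_unsup, n_unsup


labeled, pseudo = split_batch(1024, 0.7)
print(labeled, pseudo)  # 307 717
```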
The code has been implemented and tested with Python 3.8.5, PyTorch 1.8.0, and apex (used for acceleration).
- RobustWRN: https://github.com/HanxunH/RobustWRN
- adversarial_robustness_pytorch: https://github.com/imrahulr/adversarial_robustness_pytorch
- MART: https://github.com/YisenWang/MART
- TRADES: https://github.com/yaodongyu/TRADES
- RST: https://github.com/yaircarmon/semisup-adv
- AutoAttack: https://github.com/fra31/auto-attack