BlackboxBench is a comprehensive benchmark of mainstream adversarial black-box attack methods implemented in PyTorch. It can be used to evaluate the adversarial robustness of ML models, or as a baseline for developing more advanced attack and defense methods. Currently, we support:
- Attack methods:
  - Query-based attack methods:
    - 7 score-based attacks: NES, ZOSignSGD, Bandit-prior, ECO attack, SimBA, SignHunter, Square attack.
    - 8 decision-based attacks: Boundary attack, OPT attack, Sign-OPT, Evolutionary attack, GeoDA, HSJA, Sign Flip, RayS.
  - Transfer attack methods: Coming soon!
- Datasets: CIFAR-10, ImageNet.
- Models: Several CNN models pretrained on the above two datasets.
We also provide a public leaderboard evaluating the above black-box attacks against several undefended and defended deep models on these two datasets.
BlackboxBench will be continuously updated with implementations of more attack and defense methods, as well as evaluations on more datasets and models. You are welcome to contribute your black-box attack methods to BlackboxBench.
Table of Contents
- Requirements
- Usage
- Supported attacks
- Supported datasets
- Supported testing models
- Supported defense methods
- Citation
You can run the following command to configure the necessary environment:
pip install -r requirement.txt
Before running the main files attack_cifar10.py & attack_imagenet.py, users need to load a pretrained model from a .pth file. The following snippet shows how to load a Wide-ResNet-28 pretrained on CIFAR-10. Put the pretrained model file 'cifar_wrn_28.pth' into 'pretrained_models/' and change the file path accordingly in utils/model_loader.py.
elif model_name == 'wrn28':
    # Directory and file name of the pretrained Wide-ResNet-28 checkpoint.
    TRAINED_MODEL_PATH = data_path_join('pretrained_models/wrn_adv/')
    filename = 'cifar_wrn_28.pth'
    # Build the architecture and wrap it for (multi-)GPU inference.
    pretrained_model = wrn.WideNet()
    pretrained_model = torch.nn.DataParallel(pretrained_model)
    # Load the checkpoint; the model weights are stored under the 'net' key.
    checkpoint = torch.load(os.path.join(TRAINED_MODEL_PATH, filename))
    # if hasattr(pretrained_model, 'module'):
    #     pretrained_model = pretrained_model.module
    pretrained_model.load_state_dict(checkpoint['net'])
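To sanity-check the loaded weights, you can run a dummy forward pass. The following is a minimal sketch, not part of the repository; it reuses the `pretrained_model` variable from the snippet above.

```python
import torch

# Quick check that the checkpoint was loaded correctly: a random CIFAR-10-sized
# batch should produce logits of shape [batch_size, 10].
pretrained_model.eval()
with torch.no_grad():
    dummy = torch.rand(4, 3, 32, 32)   # four random images in [0, 1]
    logits = pretrained_model(dummy)
print(logits.shape)                    # expected: torch.Size([4, 10])
```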
Users can modify the configuration files (***.json) to run different attack methods on different models with the l_inf or l_2 norm. The following explains how to modify a config-json file as desired. Here is an example config-json file for the Sign-OPT attack on Wide-ResNet-28 (CIFAR-10 dataset).
{
"_comment1": "===== DATASET CONFIGURATION =====",
"dset_name": "cifar10", #Users can change the dataset here.
"dset_config": {},
"_comment2": "===== EVAL CONFIGURATION =====",
"num_eval_examples": 10000,
"_comment3": "=====ADVERSARIAL EXAMPLES CONFIGURATION=====",
"attack_name": "SignOPTAttack", #We choose Signopt attack method.
"attack_config": {
"batch_size": 1,
"epsilon": 255,
"p": "2", #set the perturbation norm to be l-2 norm, while "inf" represents l-infty norm.
"alpha": 0.2,
"beta": 0.001,
"svm": false,
"momentum": 0,
"max_queries": 10000, #We use unified maximum queries number to be 10000.
"k": 200,
"sigma": 0
},
"device": "gpu",
"modeln": "wrn28", #the name should be in accordance with the one in model_loader.py
"target": false, #Users can choose to run targeted attack(true) or untargeted attack(false).
"target_type": "median",
"seed":123
}
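Note that the inline `#` comments above are annotations for this README only; standard JSON does not support them, so they should not appear in the actual config file. Below is a minimal sketch (assumed, not the repository's exact parsing code) of how such a config might be read.

```python
import json
import sys

# Load the attack configuration passed on the command line, e.g.
#   python attack_cifar10.py some_config.json
with open(sys.argv[1]) as f:
    config = json.load(f)

print(config["dset_name"])                      # e.g. "cifar10"
print(config["attack_name"])                    # e.g. "SignOPTAttack"
print(config["attack_config"]["max_queries"])   # e.g. 10000
```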
We set the maximum number of queries to 10000 in all tests, and the attack budgets are set uniformly to:
- CIFAR-10: l_inf: 0.05 = 12.75/255, l_2: 1 = 255/255
- ImageNet: l_inf: 0.05 = 12.75/255, l_2: 5 = 1275/255

where l_inf denotes an l_infty-norm perturbation and l_2 denotes an l_2-norm perturbation.
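These budgets assume pixel values in [0, 1] (hence 0.05 = 12.75/255 on the [0, 255] scale). For illustration, here is a minimal sketch of keeping a perturbation inside such a budget; it assumes images in [0, 1] and is not the repository's projection code.

```python
import torch

def project(x_adv, x_orig, p, eps):
    """Clip/rescale the perturbation x_adv - x_orig into the given budget.
    Assumes images in [0, 1], matching the budgets listed above."""
    delta = x_adv - x_orig
    if p == "inf":
        delta = torch.clamp(delta, -eps, eps)                  # per-pixel l_inf clipping
    else:  # p == "2"
        norms = delta.flatten(1).norm(dim=1).clamp(min=1e-12)
        scale = (eps / norms).clamp(max=1.0).view(-1, 1, 1, 1)
        delta = delta * scale                                  # rescale onto the l_2 ball
    return torch.clamp(x_orig + delta, 0.0, 1.0)

# CIFAR-10 example: eps = 0.05 for l_inf, eps = 1.0 for l_2
```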
After modifying the attack config files in config-jsons as desired, pass the config file of the considered attack to attack_cifar10.py as follows (running an attack on CIFAR-10 as an example):
python attack_cifar10.py ***.json
Score-Based Black-box attack | File name | Paper |
---|---|---|
NES Attack | nes_attack.py | Black-box Adversarial Attacks with Limited Queries and Information ICML 2018 |
ZO-signSGD | zo_sign_agd_attack.py | signSGD via Zeroth-Order Oracle ICLR 2019 |
Bandit Attack | bandit_attack.py | Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors ICLR 2019 |
SimBA | simple_attack.py | Simple Black-box Adversarial Attacks ICML 2019 |
ECO Attack | parsimonious_attack.py | Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization ICML 2019 |
Sign Hunter | sign_attack.py | Sign Bits Are All You Need for Black-Box Attacks ICLR 2020 |
Square Attack | square_attack.py | Square Attack: a query-efficient black-box adversarial attack via random search ECCV 2020 |
Decision-Based Black-box attack | File name | Paper |
---|---|---|
Boundary Attack | boundary_attack.py | Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models ICLR 2018 |
OPT | opt_attack.py | Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach ICLR 2019 |
Sign-OPT | sign_opt_attack.py | Sign-OPT: A Query-Efficient Hard-label Adversarial Attack ICLR 2020 |
Evolutionary Attack | evo_attack.py | Efficient Decision-based Black-box Adversarial Attacks on Face Recognition CVPR 2019 |
GeoDA | geoda_attack.py | GeoDA: a geometric framework for black-box adversarial attacks CVPR 2020 |
HSJA | hsja_attack.py | HopSkipJumpAttack: A Query-Efficient Decision-Based Attack IEEE S&P 2020 |
Sign Flip Attack | sign_flip_attack.py | Boosting Decision-based Black-box Adversarial Attacks with Random Sign Flip ECCV 2020 |
RayS | rays_attack.py | RayS: A Ray Searching Method for Hard-label Adversarial Attack KDD 2020 |
CIFAR-10, ImageNet. Please first download these two datasets into data/. Here, we test the contained attack methods on the whole CIFAR-10 test set and on an ImageNet competition subset of 1000 samples.
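CIFAR-10 can be fetched automatically with torchvision, as in the minimal sketch below; the exact directory layout expected by the data loaders may differ, so check the repository's data-loading code. ImageNet must be downloaded manually and placed under data/ as well.

```python
import torchvision

# Download the CIFAR-10 test set into data/ (the attacks are evaluated on the test set).
torchvision.datasets.CIFAR10(root="data/", train=False, download=True)
```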
You can test any model trained on CIFAR-10 or ImageNet by adding loading code for it in utils/model_loader.py (see the sketch after this list). Here, we test the contained attack methods on the models below.
- CIFAR-10: ResNet-50, WideResNet-28, AT-l_inf-WideResNet-28 (with extra data (Gowal et al., 2020)), AT-l_inf-WideResNet-28 (with data from DDPM (Rebuffi et al., 2021)). For ResNet-50 and WideResNet-28, we train them using the code from this github repo.
- ImageNet: ResNet-50, Inception-v3, AT-l_inf-ResNet-50 (4/255) (Salman et al., 2020), FastAT-l_inf-ResNet-50 (4/255) (Wong et al., 2020). For ResNet-50 and Inception-v3, we use the pretrained models provided by torchvision.
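To add a new testing model, append a branch to utils/model_loader.py analogous to the wrn28 example above. Here is a minimal sketch for a torchvision ResNet-50 on ImageNet; the branch name and surrounding structure are assumptions, not the repository's exact code.

```python
# Hypothetical extra branch in utils/model_loader.py, mirroring the 'wrn28' example.
elif model_name == 'resnet50_imagenet':
    import torchvision.models as models
    pretrained_model = models.resnet50(pretrained=True)       # torchvision ImageNet weights
    pretrained_model = torch.nn.DataParallel(pretrained_model)
    pretrained_model.eval()
```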
Here, we also provide several defense methods against black-box attacks.
- Random Noise Defense Against Query-Based Black-Box Attacks (RND) (Qin et al., 2021): RND is a lightweight, plug-and-play defense against query-based attacks. It is realized by adding random noise to each query at inference time (one line of code in PyTorch: x = x + noise_size * torch.randn_like(x)). You can simply tune sigma (noise_size) to apply RND in attack_cifar10.py & attack_imagenet.py.
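For illustration, a minimal sketch of RND as a wrapper around a model's forward pass; the function name and the sigma value are illustrative, not the repository's implementation.

```python
import torch

def rnd_forward(model, x, noise_size=0.02):
    """Random Noise Defense (Qin et al., 2021): add Gaussian noise to every query
    at inference time. noise_size (sigma) is tunable; 0.02 is only an example value."""
    return model(x + noise_size * torch.randn_like(x))
```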
If you want to use this library in your research, cite it as follows:
@misc{blackboxbench,
title={BlackboxBench (Python Library)},
author={Zeyu Qin and Xuanchen Yan and Baoyuan Wu},
year={2022},
url={https://github.com/SCLBD/BlackboxBench}
}
If interested, you can read our recent works on black-box attack and defense methods; more of our works on trustworthy AI can be found here.
@inproceedings{cgattack-cvpr2022,
title={Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution},
author={Feng, Yan and Wu, Baoyuan and Fan, Yanbo and Liu, Li and Li, Zhifeng and Xia, Shutao},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2022}
}
@article{rnd-blackbox-defense-nips2021,
title={Random Noise Defense Against Query-Based Black-Box Attacks},
author={Qin, Zeyu and Fan, Yanbo and Zha, Hongyuan and Wu, Baoyuan},
journal={Advances in Neural Information Processing Systems},
volume={34},
year={2021}
}
@inproceedings{liang2021parallel,
title={Parallel Rectangle Flip Attack: A Query-Based Black-Box Attack Against Object Detection},
author={Liang, Siyuan and Wu, Baoyuan and Fan, Yanbo and Wei, Xingxing and Cao, Xiaochun},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={7697--7707},
year={2021}
}
@inproceedings{chen2020boosting,
title={Boosting decision-based black-box adversarial attacks with random sign flip},
author={Chen, Weilun and Zhang, Zhaoxiang and Hu, Xiaolin and Wu, Baoyuan},
booktitle={European Conference on Computer Vision},
pages={276--293},
year={2020},
organization={Springer}
}
@inproceedings{evolutionary-blackbox-attack-cvpr2019,
title={Efficient decision-based black-box adversarial attacks on face recognition},
author={Dong, Yinpeng and Su, Hang and Wu, Baoyuan and Li, Zhifeng and Liu, Wei and Zhang, Tong and Zhu, Jun},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={7714--7722},
year={2019}
}
Zeyu Qin, Run Liu, Xuanchen Yan
The source code of this repository is licensed by The Chinese University of Hong Kong, Shenzhen under the Creative Commons Attribution-NonCommercial 4.0 International Public License (identified as CC BY-NC-4.0 in SPDX). More details about the license can be found in LICENSE.
This project is built by the Secure Computing Lab of Big Data (SCLBD) at The Chinese University of Hong Kong, Shenzhen and the Shenzhen Research Institute of Big Data, directed by Professor Baoyuan Wu. SCLBD focuses on research in trustworthy AI, including backdoor learning, adversarial examples, federated learning, fairness, etc.