YourBench is a PyTorch library that takes a user's model as input and evaluates how robust it is to adversarial attacks. To make adversarial training easier for model developers, the model's evaluation metrics are provided along with a report.
Adversarial attacks are the most representative way to attack deep learning models: the same gradient-based machinery used to train a model can be turned against it to prevent it from making correct predictions. An adversarial example looks identical to the original data to the human eye, yet it can produce a completely different result when fed to the model. Even a model that classifies test images well is difficult to use if it is vulnerable to such adversarial attacks.
Even if a model gives sufficiently reliable results on test data, it becomes unusable if it is vulnerable to simple data manipulation. Adversarial attacks and model robustness are like police and thieves: each keeps evolving to catch up with the other. Even if your current neural network is robust against today's adversarial attacks, new attack techniques may appear at any time, so from the developer's point of view it is important to always be prepared for new attacks. However, doing so is costly and time-consuming, which makes it just as important to check how robust your network is against known strong adversarial attacks.
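As a concrete illustration of the idea above, here is a minimal sketch of the classic FGSM perturbation in plain PyTorch (this is not the YourBench API; `model`, `images`, and `labels` stand for any image classifier and a batch of inputs in [0, 1]): a single signed-gradient step of size eps that is imperceptible to a human can be enough to change the prediction.
import torch
import torch.nn.functional as F
def fgsm_example(model, images, labels, eps=8/255):
    # One signed-gradient step in the direction that increases the loss.
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv_images = images + eps * images.grad.sign()
    # Clamp back to the valid pixel range so the result is still a valid image.
    return adv_images.clamp(0, 1).detach()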
Unlike other libraries, YourBench takes a user's own neural network as input and provides a benchmark score for adversarial attacks together with a report. The report points out weaknesses in the model and suggests ways to improve them, so developers can assess the stability of their model.
pip install yourbench
YourBench places a few constraints on the model to be measured in order to perform more accurate tests and provide reports.
- No Zero Gradients
Models that rely on vanishing/exploding gradients, shattered gradients, or stochastic gradients, collectively known as obfuscated gradients, are not recommended. Obfuscated gradients are not a suitable defense technique, and they make gradient-based adversarial example generation very difficult. For such models we recommend attacking through EOT, BPDA, or reparameterization (see the sketch after this list).
- No Loops in Forward Pass
Models with loops in the forward pass increase the cost of backpropagation and take a long time to attack. For these models, we recommend an attack that can be applied adaptively by combining the loss of the loop with the model's task loss.
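For the randomized or obfuscated-gradient models mentioned above, EOT-style attacks estimate the gradient by averaging it over the model's randomness. Below is a minimal sketch of that idea in plain PyTorch; the function name and sample count are assumptions, not part of the YourBench API.
import torch.nn.functional as F
def eot_gradient(model, images, labels, n_samples=10):
    # Expectation over Transformation: average the loss over several
    # stochastic forward passes so random defenses cannot mask the gradient.
    images = images.clone().detach().requires_grad_(True)
    total_loss = 0.0
    for _ in range(n_samples):
        total_loss = total_loss + F.cross_entropy(model(images), labels)
    (total_loss / n_samples).backward()
    return images.grad.detach()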
- Running with built-in models
import yourbench
atk = yourbench.PGD(model, eps=8/255, alpha=2/255, steps=4)
adv_images = atk(images, labels)
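Given the attack object above, robust accuracy can be measured with an ordinary evaluation loop. This is a minimal sketch assuming a standard `test_loader` DataLoader and that the model, the data, and the attack live on the same device; only the `atk(images, labels)` call from the demo is part of the YourBench API.
import torch
correct, total = 0, 0
for images, labels in test_loader:
    adv_images = atk(images, labels)           # adversarial examples from the attack above
    with torch.no_grad():
        preds = model(adv_images).argmax(dim=1)
    correct += (preds == labels).sum().item()  # predictions that survive the attack
    total += labels.size(0)
print(f"Robust accuracy: {100 * correct / total:.1f}%")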
- Targeted mode: you can take the dataset, model, and image labels from the user and set the attack's target label with a mapping function, as shown below.
# label from mapping function
atk.set_mode_targeted_by_function(target_map_function=lambda images, labels:(labels+1)%10)
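With this mapping the attack becomes targeted: instead of only pushing an image away from its true class, it tries to push it toward class (labels + 1) % 10, assuming a 10-class dataset such as CIFAR-10. A quick, hypothetical way to check how often the targeting succeeds:
import torch
adv_images = atk(images, labels)               # same call as before, now targeted
with torch.no_grad():
    adv_preds = model(adv_images).argmax(dim=1)
success_rate = (adv_preds == (labels + 1) % 10).float().mean().item()
print(f"Targeted success rate: {100 * success_rate:.1f}%")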
- Strong attacks
atk1 = yourbench.FGSM(model, eps=8/255)
atk2 = yourbench.PGD(model, eps=8/255, alpha=2/255, steps=40, random_start=True)
atk = yourbench.MultiAttack([atk1, atk2])
- Binary search for CW
atk1 = yourbench.CW(model, c=0.1, steps=1000, lr=0.01)
atk2 = yourbench.CW(model, c=1, steps=1000, lr=0.01)
atk = yourbench.MultiAttack([atk1, atk2])
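The same pattern extends to a coarser search over the CW trade-off constant c: build one CW attack per candidate value and hand them to MultiAttack, which then tries them in sequence as in the "Strong attacks" demo above. A sketch using only the classes already shown; the candidate values of c are illustrative, not a recommendation.
# Sweep several CW constants and combine them with MultiAttack.
c_values = [0.01, 0.1, 1, 10]
cw_attacks = [yourbench.CW(model, c=c, steps=1000, lr=0.01) for c in c_values]
atk = yourbench.MultiAttack(cw_attacks)
adv_images = atk(images, labels)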
Name | Paper | Remark |
---|---|---|
FGSM (Linf) | Explaining and Harnessing Adversarial Examples (Goodfellow et al., 2014) | |
CW (L2) | Towards Evaluating the Robustness of Neural Networks (Carlini et al., 2016) | |
PGD (Linf) | Towards Deep Learning Models Resistant to Adversarial Attacks (Madry et al., 2017) | Projected Gradient Method |
DeepFool (L2) | DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks (Moosavi-Dezfooli et al., 2016) | |
The benchmark below uses model checkpoints from RobustBench (Standard, Wong2020Fast, Rice2020Overfitting), so the reported scores can be checked for reliability against published results. Each cell shows the robust accuracy (and, for L2 attacks, the average L2 distance between the original and adversarial images) with the elapsed time in parentheses.
Attack | Package | Standard | Wong2020Fast | Rice2020Overfitting | Remark |
---|---|---|---|---|---|
FGSM (Linf) | Torchattacks | 34% (54ms) | 48% (5ms) | 62% (82ms) | |
FGSM (Linf) | Foolbox* | 34% (15ms) | 48% (8ms) | 62% (30ms) | |
FGSM (Linf) | ART | 34% (214ms) | 48% (59ms) | 62% (768ms) | |
PGD (Linf) | Torchattacks | 0% (174ms) | 44% (52ms) | 58% (1348ms) | 👑 Fastest |
PGD (Linf) | Foolbox* | 0% (354ms) | 44% (56ms) | 58% (1856ms) | |
PGD (Linf) | ART | 0% (1384ms) | 44% (437ms) | 58% (4704ms) | |
CW (L2) | Torchattacks | 0% / 0.40 (2596ms) | 14% / 0.61 (3795ms) | 22% / 0.56 (43484ms) | 👑 Highest Success Rate, 👑 Fastest |
CW (L2) | Foolbox* | 0% / 0.40 (2668ms) | 32% / 0.41 (3928ms) | 34% / 0.43 (44418ms) | |
CW (L2) | ART | 0% / 0.59 (196738ms) | 24% / 0.70 (66067ms) | 26% / 0.65 (694972ms) | |
PGD (L2) | Torchattacks | 0% / 0.41 (184ms) | 68% / 0.5 (52ms) | 70% / 0.5 (1377ms) | 👑 Fastest |
PGD (L2) | Foolbox* | 0% / 0.41 (396ms) | 68% / 0.5 (57ms) | 70% / 0.5 (1968ms) | |
PGD (L2) | ART | 0% / 0.40 (1364ms) | 68% / 0.5 (429ms) | 70% / 0.5 (4777ms) | |
* Because FoolBox returns both accuracy and adversarial images, the actual image generation time may be shorter than stated.
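For the L2 rows, the second number in each cell is the average L2 distance between the original and adversarial images. A minimal sketch of how such a value can be computed (the function name and arguments are illustrative):
import torch
def mean_l2_distance(images, adv_images):
    # Average L2 norm of the per-image perturbation over the batch.
    diff = (adv_images - images).flatten(start_dim=1)
    return diff.norm(p=2, dim=1).mean().item()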
New adversarial attacks will continue to appear. YourBench intends to be the library used to prove the adversarial robustness of models. Let us know if a new adversarial attack is published! If you would like to contribute to YourBench, please see CONTRIBUTING.md.
- Adversarial Attack Packages:
- https://github.com/IBM/adversarial-robustness-toolbox: Adversarial attack and defense package made by IBM. TensorFlow, Keras, PyTorch available.
- https://github.com/bethgelab/foolbox: Adversarial attack package made by Bethge Lab. TensorFlow, PyTorch available.
- https://github.com/tensorflow/cleverhans: Adversarial attack package made by Google Brain. TensorFlow available.
- https://github.com/BorealisAI/advertorch: Adversarial attack package made by BorealisAI. Pytorch available.
- https://github.com/DSE-MSU/DeepRobust: Adversarial attack (especially on GNN) package made by the DSE Lab at Michigan State University. PyTorch available.
- https://github.com/fra31/auto-attack: Set of attacks that is believed to be among the strongest in existence. TensorFlow, PyTorch available.
- Adversarial Defense Leaderboard:
- Adversarial Attack and Defense Papers:
- https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html: A Complete List of All (arXiv) Adversarial Example Papers made by Nicholas Carlini.
- https://github.com/chawins/Adversarial-Examples-Reading-List: Adversarial Examples Reading List made by Chawin Sitawarin.
- Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Pascal Frossard. DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks. CVPR, 2016.
- Nicholas Carlini, David Wagner. Towards Evaluating the Robustness of Neural Networks. arXiv:1608.04644.
- ETC:
- https://github.com/Harry24k/gnn-meta-attack: Adversarial Poisoning Attack on Graph Neural Network.
- https://github.com/ChandlerBang/awesome-graph-attack-papers: Graph Neural Network Attack papers.
- https://github.com/Harry24k/adversarial-attacks-pytorch/blob/master/README_KOR.md
- https://sdc-james.gitbook.io/onebook/2.-1/1./1.1.5
- https://github.com/szagoruyko/pytorchviz
- https://www.koreascience.or.kr/article/JAKO202031659967733.pdf