Submitted:
11 May 2023
Posted:
12 May 2023
Abstract

Keywords:
1. Introduction
- This work moves beyond traditional adversarial sample generation methods, which rely on guidance from the classifier's decision boundary, and interprets the generation mechanism of adversarial samples from the perspective of the sample probability distribution.
- The classification model is transformed into an energy-based model (EBM), which makes it possible to estimate the probability density of samples with the classification model.
- The classifier learns the probability distribution of the samples by aligning its gradient with the logarithmic gradient of the sample probability density. The aligned gradient guides adversarial sample generation directionally and improves transferability and interpretability across models with different structures.
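
The second and third contributions can be illustrated with a minimal sketch. It assumes the energy-based reading of a classifier from Grathwohl et al. (listed in the References), in which the logits f(x) define log p(x, y) = f_y(x) - log Z. The toy linear classifier, its weights `W` and `b`, and all function names are hypothetical, and the update rule is a generic sign-gradient step that is not necessarily the authors' SBMA procedure. A linear model does not define a normalisable density, so the example only serves to check the gradient identities.

```python
import math

# Hypothetical toy classifier: logits f(x) = W x + b, 3 classes, 2 input features.
W = [[1.0, -0.5], [-0.3, 0.8], [0.2, 0.1]]
b = [0.1, -0.2, 0.0]


def logits(x):
    """Return the class logits f(x) = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b_k for row, b_k in zip(W, b)]


def softmax(f):
    """Numerically stable softmax over a list of logits."""
    m = max(f)
    e = [math.exp(v - m) for v in f]
    s = sum(e)
    return [v / s for v in e]


def log_density_unnorm(x):
    """log p(x) up to an additive constant: logsumexp of the logits."""
    f = logits(x)
    m = max(f)
    return m + math.log(sum(math.exp(v - m) for v in f))


def score_marginal(x):
    """grad_x log p(x) = sum_k softmax_k(f(x)) * W[k] for the linear toy model."""
    p = softmax(logits(x))
    return [sum(p[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]


def score_conditional(x, y):
    """grad_x log p(x | y) = grad_x f_y(x), since the normaliser does not depend on x.

    For the linear toy model this is simply the row W[y].
    """
    return list(W[y])


def sign(v):
    return (v > 0) - (v < 0)


def targeted_step(x, x0, y_target, step=0.05, eps=0.2):
    """One sign-gradient ascent step on log p(x | y_target).

    The result is clipped to an L-infinity ball of radius eps around x0.
    """
    g = score_conditional(x, y_target)
    moved = [xi + step * sign(gi) for xi, gi in zip(x, g)]
    return [min(max(m, c - eps), c + eps) for m, c in zip(moved, x0)]


if __name__ == "__main__":
    x0 = [0.3, -0.7]
    x_adv = targeted_step(x0, x0, y_target=1)
    print("score of log p(x):", score_marginal(x0))
    print("perturbed sample:", x_adv)
```

For a deep classifier, the same scores would come from automatic differentiation of the logits with respect to the input rather than from a closed form.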
2. Related Works
2.1. Adversarial Sample Attack Methods
2.2. Probability Density Estimation Methods
2.3. Score Matching Methods
3. Methodology
3.1. Estimation Of The Logarithmic Gradient Of The True Conditional Probability Density
3.2. Transformation Of Classification Model To EBM
3.3. Generation Of Adversarial Samples On Gradient-Aligned Classifiers
4. Experiments And Results
4.1. Experimental Settings
4.2. Metrics Comparison
4.2.1. Attack Target Model Experiments
4.2.2. Experiments On The CIFAR-100 Dataset
4.3. Visualization And Interpretability
4.3.1. Generating Adversarial Samples
4.3.2. Corresponding Feature Space
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Naveed Akhtar and Ajmal Mian. “Threat of adversarial attacks on deep learning in computer vision: A survey”. In: IEEE Access 6 (2018), pp. 14410–14430.
- Christian Szegedy et al. “Intriguing properties of neural networks”. arXiv:1312.6199 (2013).
- Ranjie Duan et al. “Adversarial camouflage: Hiding physical-world attacks with natural styles”. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020, pp. 1000–1008.
- Yu Zhang et al. “Boosting transferability of physical attack against detectors by redistributing separable attention”. In: Pattern Recognition 138 (2023), p. 109435.
- Max Welling and Yee W Teh. “Bayesian learning via stochastic gradient Langevin dynamics”. In: Proceedings of the 28th international conference on machine learning (ICML-11). 2011, pp. 681–688.
- Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. “Explaining and harnessing adversarial examples”. arXiv:1412.6572 (2014).
- Alexey Kurakin, Ian J Goodfellow, and Samy Bengio. “Adversarial examples in the physical world”. In: Artificial intelligence safety and security. Chapman and Hall/CRC, 2018, pp. 99–112.
- Yinpeng Dong et al. “Boosting adversarial attacks with momentum”. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2018, pp. 9185–9193.
- Aleksander Madry et al. “Towards deep learning models resistant to adversarial attacks”. arXiv:1706.06083 (2017).
- Cihang Xie et al. “Improving transferability of adversarial examples with input diversity”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, pp. 2730–2739.
- Nicolas Papernot et al. “The limitations of deep learning in adversarial settings”. In: 2016 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE. 2016, pp. 372–387.
- Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. “Deep inside convolutional networks: Visualising image classification models and saliency maps”. arXiv:1312.6034 (2013).
- Nicholas Carlini and David Wagner. “Towards evaluating the robustness of neural networks”. In: 2017 IEEE Symposium on Security and Privacy (SP). IEEE. 2017, pp. 39–57.
- Jiawei Su, Danilo Vasconcellos Vargas, and Kouichi Sakurai. “One pixel attack for fooling deep neural networks”. In: IEEE Transactions on Evolutionary Computation 23.5 (2019), pp. 828–841.
- Nicolas Papernot et al. “Practical black-box attacks against machine learning”. In: Proceedings of the 2017 ACM on Asia conference on computer and communications security. 2017, pp. 506–519.
- Li Pengcheng, Jinfeng Yi, and Lijun Zhang. “Query-efficient black-box attack by active learning”. In: 2018 IEEE International Conference on Data Mining (ICDM). IEEE. 2018, pp. 1200–1205.
- Weibin Wu et al. “Boosting the transferability of adversarial samples via attention”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, pp. 1161–1170.
- Yingwei Li et al. “Learning transferable adversarial examples via ghost networks”. In: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34. 07. 2020, pp. 11458–11465.
- Yinpeng Dong et al. “Efficient decision-based black-box adversarial attacks on face recognition”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, pp. 7714–7722.
- Nikolaus Hansen and Andreas Ostermeier. “Completely derandomized self-adaptation in evolution strategies”. In: Evolutionary computation 9.2 (2001), pp. 159–195.
- Thomas Brunner et al. “Guessing smart: Biased sampling for efficient black-box adversarial attacks”. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019, pp. 4958–4966.
- Yucheng Shi, Yahong Han, and Qi Tian. “Polishing decision-based adversarial noise with a customized sampling”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, pp. 1030–1038.
- Ali Rahmati et al. “Geoda: a geometric framework for black-box adversarial attacks”. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020, pp. 8446–8455.
- Ian Goodfellow et al. “Generative adversarial networks”. In: Communications of the ACM 63.11 (2020), pp. 139–144.
- Chaowei Xiao et al. “Generating adversarial examples with adversarial networks”. arXiv:1801.02610 (2018).
- Surgan Jandial et al. “Advgan++: Harnessing latent layers for adversary generation”. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. 2019, pp. 0–0.
- Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. “Deepfool: a simple and accurate method to fool deep neural networks”. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2016, pp. 2574–2582.
- Seyed-Mohsen Moosavi-Dezfooli et al. “Universal adversarial perturbations”. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017, pp. 1765–1773.
- Bolei Zhou et al. “Learning deep features for discriminative localization”. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2016, pp. 2921–2929.
- James M Joyce. “Kullback-leibler divergence”. In: International encyclopedia of statistical science. Springer, 2011, pp. 720–722.
- Xiaolong Wang, Chengxiang Zhai, and Dan Roth. “Understanding evolution of research themes: a probabilistic generative model for citations”. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. 2013, pp. 1115–1123.
- Guillaume Alain et al. “GSNs: generative stochastic networks”. In: Information and Inference: A Journal of the IMA 5.2 (2016), pp. 210–249.
- Makoto Taketo et al. “FVB/N: an inbred mouse strain preferable for transgenic analyses.” In: Proceedings of the National Academy of Sciences 88.6 (1991), pp. 2065–2069.
- Mathieu Germain et al. “Made: Masked autoencoder for distribution estimation”. In: International conference on machine learning. PMLR. 2015, pp. 881–889.
- Aaron Van den Oord et al. “Conditional image generation with pixelcnn decoders”. In: Advances in neural information processing systems 29 (2016).
- Jinwon An and Sungzoon Cho. “Variational autoencoder based anomaly detection using reconstruction probability”. In: Special lecture on IE 2.1 (2015), pp. 1–18.
- Fernando Llorente et al. “MCMC-driven importance samplers”. In: Applied Mathematical Modelling 111 (2022), pp. 310–331.
- Yilun Du and Igor Mordatch. “Implicit generation and modeling with energy based models”. In: Advances in Neural Information Processing Systems 32 (2019).
- Aapo Hyvärinen. “Estimation of non-normalized statistical models by score matching”. In: Journal of Machine Learning Research 6.4 (2005).
- Yang Song et al. “Sliced score matching: A scalable approach to density and score estimation”. In: Uncertainty in Artificial Intelligence. PMLR. 2020, pp. 574–584.
- Will Grathwohl et al. “Your classifier is secretly an energy based model and you should treat it like one”. arXiv:1912.03263 (2019).
- Jia Deng et al. “Imagenet: A large-scale hierarchical image database”. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE. 2009, pp. 248–255.
- Kaiming He et al. “Deep residual learning for image recognition”. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2016, pp. 770–778.
- Sebastian Ruder. “An overview of gradient descent optimization algorithms”. arXiv:1609.04747 (2016).
- Dongxian Wu et al. “Skip connections matter: On the transferability of adversarial examples generated with resnets”. arXiv:2002.05990 (2020).
- Karen Simonyan and Andrew Zisserman. “Very deep convolutional networks for large-scale image recognition”. arXiv:1409.1556 (2014).
- Gao Huang et al. “Densely connected convolutional networks”. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017, pp. 4700–4708.
- Christian Szegedy et al. “Inception-v4, inception-resnet and the impact of residual connections on learning”. In: Proceedings of the AAAI conference on artificial intelligence. Vol. 31. 1. 2017.
- Alexey Dosovitskiy et al. “An image is worth 16x16 words: Transformers for image recognition at scale”. arXiv:2010.11929 (2020).
- Hadi Salman et al. “Do adversarially robust imagenet models transfer better?” In: Advances in Neural Information Processing Systems 33 (2020), pp. 3533–3545.
- Robert Geirhos et al. “ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness”. arXiv:1811.12231 (2018).
- Dan Hendrycks et al. “Augmix: A simple method to improve robustness and uncertainty under data shift”. In: International conference on learning representations. Vol. 1. 4. 2020, p. 6.
- Alex Krizhevsky, Geoffrey Hinton, et al. “Learning multiple layers of features from tiny images”. (2009).
- Jonathon Shlens. “A tutorial on principal component analysis”. arXiv:1404.1100 (2014).

| Surrogate Model | Attack Method | VGG-19 | ResNet-50 | DenseNet-121 | Inception-V3 | ViT-B/16 |
|---|---|---|---|---|---|---|
| ResNet-18 | PGD [9] | 45.05% | 52.07% | 47.28% | 17.73% | 3.25% |
| ResNet-18 | MI-FGSM [8] | 55.13% | 62.24% | 62.48% | 33.22% | 11.34% |
| ResNet-18 | DI²-FGSM [10] | 63.80% | 63.80% | 69.68% | 36.19% | 6.05% |
| ResNet-18 | C&W [13] | 58.83% | 60.67% | 64.48% | 32.22% | 7.17% |
| ResNet-18 | SGM [45] | 70.31% | 75.14% | 71.03% | 45.66% | 16.00% |
| ResNet-18 | SBMA (ours) | 82.02% | 83.48% | 81.46% | 76.20% | 47.88% |

| Surrogate Model | Attack Method | VGG-19 | ResNet-50 | DenseNet-121 | Inception-V3 | ViT-B/16 |
|---|---|---|---|---|---|---|
| ResNet-18 | PGD [9] | 1.21% | 1.84% | 2.28% | 5.30% | 0.48% |
| ResNet-18 | MI-FGSM [8] | 3.36% | 5.47% | 8.72% | 6.46% | 0.48% |
| ResNet-18 | DI²-FGSM [10] | 9.04% | 11.73% | 14.91% | 6.90% | 0.71% |
| ResNet-18 | C&W [13] | 8.19% | 6.70% | 8.01% | 6.04% | 0.66% |
| ResNet-18 | SGM [45] | 52.04% | 50.69% | 49.11% | 13.18% | 5.72% |
| ResNet-18 | SBMA (ours) | 74.63% | 79.77% | 73.10% | 64.54% | 36.62% |

| Surrogate Model | Attack Method | Normal | Adversarial Training | SIN | Augmix |
|---|---|---|---|---|---|
| ResNet-18 | PGD [9] | 2.28% | 0.59% | 1.26% | 1.21% |
| ResNet-18 | MI-FGSM [8] | 8.72% | 1.83% | 4.71% | 4.54% |
| ResNet-18 | DI²-FGSM [10] | 14.91% | 3.60% | 8.40% | 8.10% |
| ResNet-18 | C&W [13] | 8.01% | 1.64% | 4.23% | 4.08% |
| ResNet-18 | SGM [45] | 49.11% | 9.22% | 28.53% | 27.50% |
| ResNet-18 | SBMA (ours) | 73.10% | 15.13% | 60.27% | 54.62% |

| Surrogate Model | Attack Method | VGG-19 | ResNet-50 | DenseNet-121 | Inception-V3 | ViT-B/16 |
|---|---|---|---|---|---|---|
| ResNet-18 | PGD [9] | 0.98% | 1.56% | 1.93% | 4.49% | 0.37% |
| ResNet-18 | MI-FGSM [8] | 2.84% | 4.52% | 7.39% | 5.95% | 0.40% |
| ResNet-18 | DI²-FGSM [10] | 7.66% | 9.94% | 11.43% | 5.85% | 0.48% |
| ResNet-18 | C&W [13] | 6.93% | 5.67% | 6.78% | 5.12% | 0.56% |
| ResNet-18 | SGM [45] | 44.08% | 40.80% | 41.60% | 11.16% | 3.65% |
| ResNet-18 | SBMA (ours) | 52.68% | 51.31% | 49.38% | 41.56% | 20.85% |

| Number of samples (at size 3 × 224 × 224) | | Size of samples (at 1300p) | |
|---|---|---|---|
| 1300p | 73.10% | 3 × 224 × 224 | 73.10% |
| 650p | 62.69% | 3 × 112 × 112 | 67.52% |
| 300p | 45.16% | 3 × 64 × 64 | 58.35% |
| 100p | 21.08% | 3 × 32 × 32 | 40.44% |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).