Fighting Gradients with Gradients: Dynamic Defenses against Adversarial Attacks


Adversarial attacks optimize against models to defeat defenses. We argue that models should fight back, and optimize their defenses against attacks at test time. Existing defenses are static, and stay the same once trained, even while attacks change. We propose a dynamic defense, defensive entropy minimization (dent), to adapt the model and input during testing by gradient optimization. Our dynamic defense adapts fully at test time, without altering training, which makes it compatible with existing models and standard defenses. Dent improves robustness to attack by more than 20 absolute points for state-of-the-art static defenses against AutoAttack on CIFAR-10.
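To make the idea concrete, here is a minimal sketch of test-time entropy minimization on the input. It is not the paper's implementation: it assumes a hypothetical tiny linear softmax classifier (`W`, `b`) standing in for a trained network, and takes gradient steps on the input to reduce the entropy of the model's prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny linear classifier standing in for a trained network
# (3 classes, 5 input features) -- an assumption for illustration only.
W = rng.normal(size=(3, 5))
b = rng.normal(size=3)

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def entropy(p):
    # Shannon entropy of a probability vector (epsilon avoids log(0)).
    return -np.sum(p * np.log(p + 1e-12))

def dent_step(x, lr=0.1):
    """One entropy-minimization step on the input x.

    For p = softmax(z), z = Wx + b, the entropy gradient is
    dH/dz_j = -p_j (log p_j + H), and by the chain rule
    dH/dx = W^T dH/dz.
    """
    p = softmax(W @ x + b)
    H = entropy(p)
    dH_dz = -p * (np.log(p + 1e-12) + H)
    grad_x = W.T @ dH_dz
    return x - lr * grad_x  # descend the entropy surface

x = rng.normal(size=5)  # stand-in for a (possibly attacked) input
H0 = entropy(softmax(W @ x + b))
for _ in range(20):
    x = dent_step(x)
H1 = entropy(softmax(W @ x + b))
print(H1 < H0)
```

Minimizing entropy pushes the prediction toward a confident class, which is the intuition the abstract describes; the paper's actual method operates on full networks and can also adapt model parameters, not just the input.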

arXiv preprint arXiv:2105.08714