Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
ICML · Feb 1, 2018 · Best Paper
We identify obfuscated gradients, a kind of gradient masking, as a phenomenon
that leads to a false sense of security in defenses against adversarial
examples. While defenses that cause obfuscated gradients appear to defeat
iterative optimization-based attacks, we find defenses relying on this effect
can be circumvented. We describe characteristic behaviors of defenses
exhibiting the effect, and for each of the three types of obfuscated gradients
we discover, we develop attack techniques to overcome it. In a case study
examining non-certified white-box-secure defenses at ICLR 2018, we find that
obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on
them. Our new attacks successfully circumvent 6 of these completely, and
1 partially, in the original threat model each paper considers.
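
As a rough illustration of the effect described above, the following toy sketch shows how a non-differentiable preprocessing step can stall an iterative gradient-based attack, and how approximating that step by the identity on the backward pass can restore a useful attack direction. Everything here is an assumption made for illustration: the linear classifier (`W`, `B`), the quantization "defense", the input `x0`, and the attack parameters are hypothetical and are not taken from the paper's experiments.

```python
import numpy as np

# Hypothetical linear classifier: predicts class 1 when the logit is positive.
W = np.array([1.0, -2.0, 0.5, 1.5])
B = -0.5

def logit(x):
    return float(W @ x + B)

def quantize(x, levels=5):
    # Hypothetical input-quantization "defense": piecewise constant, so its
    # derivative is zero almost everywhere.
    return np.round(x * (levels - 1)) / (levels - 1)

def defended_logit(x):
    return logit(quantize(x))

def numerical_grad(f, x, h=1e-4):
    # Central finite differences, standing in for automatic differentiation.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def naive_grad(x):
    # Differentiates through the defense; this is zero away from the
    # quantization boundaries.
    return numerical_grad(defended_logit, x)

def approx_grad(x):
    # Forward pass goes through the defense; the backward pass treats the
    # defense as the identity, so only the classifier is differentiated.
    return numerical_grad(logit, quantize(x))

def attack(x0, grad_fn, eps=0.3, step=0.05, iters=20):
    # Iterative signed-gradient descent on the logit inside an L-infinity ball.
    x = x0.copy()
    for _ in range(iters):
        x = x - step * np.sign(grad_fn(x))
        x = np.clip(x, x0 - eps, x0 + eps)  # stay within the perturbation budget
        x = np.clip(x, 0.0, 1.0)            # stay within the valid input range
    return x

x0 = np.array([0.5, 0.25, 0.5, 0.75])
x_naive = attack(x0, naive_grad)
x_approx = attack(x0, approx_grad)

print("clean logit:            ", defended_logit(x0))
print("after naive attack:     ", defended_logit(x_naive))
print("after approximate attack:", defended_logit(x_approx))
```

In this sketch the naive attack never moves the input because every gradient it sees is zero, which could be mistaken for robustness. The approximate-gradient attack flips the sign of the defended logit within the same perturbation budget, so the apparent robustness comes from the masked gradient rather than from the classifier.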