Adversarial Examples Are Not Bugs, They Are Features

Response to Commenters

We want to thank all the commenters for the discussion and for spending time designing experiments, analyzing, replicating, and expanding upon our results. These comments helped us further refine our understanding of adversarial examples (e.g., by visualizing useful non-robust features or illustrating how robust models are successful at downstream tasks), and also highlighted aspects of our exposition that could be made clearer and more explicit.

Organization of This Response

Our response is organized as follows: we first recap the key takeaways from our paper, followed by some clarifications that this discussion brought to light. We then address each comment individually, prefacing each longer response with a quick summary.

Terminology and Notation

We also recall some terminology from our paper that features in our responses:

  • Dataset: Our experiments involve the following variants of the given dataset $D$, which consists of sample-label pairs $(x, y)$. (The exact details of how these datasets are constructed can be found in our paper, and the datasets themselves can be downloaded at http://git.io/adv-datasets.)
      • $\widehat{\mathcal{D}}_{R}$
      • $\widehat{\mathcal{D}}_{NR}$
