Detailed Response
The main hypothesis in Ilyas et al. (2019) happens to be a special case of a more general principle that is commonly accepted in the robustness to distributional shift literature: a model’s lack of robustness is largely because the model latches onto superficial statistics in the data. In the image domain, these statistics may be unused by — and unintuitive to — humans, yet they may be useful for generalization in i.i.d. settings.
Separate experiments eschewing gradient perturbations and studying robustness beyond adversarial perturbations show similar results. For example, a recent work demonstrates that models can generalize to the test examples by learning from high-frequency information that is both naturally occurring and inconspicuous. Concretely, models were trained and tested with an extreme high-pass filter applied to the data. The resulting high-frequency features appear nearly uniformly gray to humans, yet models are able to achieve above 50% top-1 accuracy on ImageNet-1K solely from these natural features that are usually "invisible." In the figure below, these hard-to-notice features are made conspicuous by normalizing the filtered image to have unit variance pixel statistics.
Models can achieve high accuracy using information from the input that would be unrecognizable to humans. Shown above are models trained and tested with aggressive high- and low-pass filtering applied to the inputs. With aggressive low-pass filtering, the model still achieves above 30% accuracy on ImageNet even though the images appear to be simple globs of color. In the case of high-pass (HP) filtering, models can achieve above 50% accuracy using features in the input that are nearly invisible to humans. As shown on the right-hand side, the high-pass filtered images needed to be normalized in order to properly visualize the high-frequency features.
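The filtering procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the exact pipeline from the experiments: the filter radius is a hypothetical value, and the normalization step mirrors the unit-variance visualization used in the figure.

```python
import numpy as np

def high_pass_filter(image, radius):
    """Zero out all Fourier components within `radius` of the zero
    frequency, keeping only high-frequency content.
    `image` is a 2D grayscale array."""
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # Circular mask centered on the zero-frequency component.
    ys, xs = np.ogrid[:h, :w]
    dist = np.sqrt((ys - h // 2) ** 2 + (xs - w // 2) ** 2)
    spectrum[dist <= radius] = 0.0
    return np.fft.ifft2(np.fft.ifftshift(spectrum)).real

def normalize_for_display(image, eps=1e-8):
    """Rescale to zero mean, unit variance so faint high-frequency
    features become visible to a human viewer."""
    return (image - image.mean()) / (image.std() + eps)

# Example: filter a random "image" and normalize it for viewing.
img = np.random.rand(64, 64)
hp = high_pass_filter(img, radius=16)   # radius chosen for illustration
vis = normalize_for_display(hp)
```

A low-pass filter is the mirror image of this sketch: zero out the components with `dist > radius` instead.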
Given the plethora of useful correlations that exist in natural data, we should expect that our models will learn to exploit them. However, models relying on superficial statistics can generalize poorly should these same statistics become corrupted after deployment. To obtain a more complete understanding of model robustness, we must therefore evaluate models under a broad range of perturbations and distribution shifts, not just small ℓp-bounded ones.

Model sensitivity to additive noise aligned with different Fourier basis vectors on CIFAR-10. We fix the additive noise to have ℓ2 norm 4 and evaluate three models: a naturally trained model, an adversarially trained model, and a model trained with Gaussian data augmentation. Error rates are averaged over 1000 randomly sampled images from the test set. In the bottom row we show images perturbed with noise along the corresponding Fourier basis vector. The naturally trained model is highly sensitive to additive noise in all but the lowest frequencies. Both adversarial training and Gaussian data augmentation dramatically improve robustness to additive noise in the high frequencies.
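The perturbations in the figure above can be constructed directly: a Fourier basis vector is a real-valued image whose spectrum is supported at a single frequency (and its conjugate-symmetric counterpart), rescaled to a fixed ℓ2 norm. A minimal sketch, assuming square grayscale images and glossing over details such as per-channel handling and random sign flips:

```python
import numpy as np

def fourier_basis_noise(size, i, j, l2_norm=4.0):
    """Real-valued 2D noise whose Fourier spectrum is supported only at
    frequency (i, j) and its symmetric counterpart, scaled to `l2_norm`."""
    spectrum = np.zeros((size, size), dtype=complex)
    spectrum[i, j] = 1.0
    # Conjugate symmetry guarantees the inverse transform is real.
    spectrum[-i % size, -j % size] = 1.0
    basis = np.fft.ifft2(spectrum).real
    basis /= np.linalg.norm(basis)   # unit l2 norm
    return l2_norm * basis

# Perturb an image with noise aligned with frequency (3, 5),
# matching the fixed l2 norm of 4 used in the evaluation above.
img = np.random.rand(32, 32)
noisy = np.clip(img + fourier_basis_noise(32, 3, 5, l2_norm=4.0), 0.0, 1.0)
```

Sweeping `(i, j)` over the whole spectrum and recording each model's error rate produces the heatmaps summarized in the caption.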
By taking a broader view of robustness beyond tiny ℓp norm perturbations, we discover that adversarially trained models are actually not “robust.” They are instead biased towards different kinds of superficial statistics. As a result, adversarial training can sacrifice robustness in real-world settings.
Conclusion
In conclusion, the lack of robustness in models is largely due to their reliance on superficial statistics in the data. By recognizing the importance of high-frequency features and the limitations of ℓp norm perturbations, we can gain a deeper understanding of model robustness and develop more effective strategies for improving it.
FAQs
Q: What is the main hypothesis in Ilyas et al. (2019)?
A: The main hypothesis is that adversarial vulnerability is largely a consequence of models latching onto superficial statistics in the data — features that are genuinely predictive in i.i.d. settings yet brittle under perturbation.
Q: What are superficial statistics in the data?
A: Superficial statistics are patterns in the data — such as high-frequency or texture cues — that are predictive of the label in i.i.d. settings but are unintuitive to humans and unstable under distribution shift.
Q: What is the importance of high-frequency features in model robustness?
A: Models can achieve high accuracy using high-frequency features that are nearly invisible to humans. This reliance helps explain why models are sensitive to high-frequency corruptions, and why interventions like adversarial training and Gaussian data augmentation shift a model's frequency sensitivity rather than making it robust outright.
Q: What are ℓp norm perturbations?
A: ℓp norm perturbations are small changes to the input whose size is measured — and bounded — by an ℓp norm, such as ℓ2 or ℓ∞. Adversarial robustness is commonly evaluated against worst-case perturbations within such a bound.
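For concreteness, here is a small sketch of how the size of a perturbation is measured under the two most common norms (the arrays and values are purely illustrative):

```python
import numpy as np

def perturbation_norms(x, x_adv):
    """Return (l2, linf) norms of the perturbation delta = x_adv - x.
    l2 is the Euclidean length of delta; linf is its largest entry."""
    delta = (x_adv - x).ravel()
    return np.linalg.norm(delta), np.abs(delta).max()

# Toy example: perturb two pixels of a flat image.
x = np.zeros(10)
x_adv = x.copy()
x_adv[0], x_adv[1] = 3.0, 4.0
l2, linf = perturbation_norms(x, x_adv)  # l2 = 5.0, linf = 4.0
```

An ℓ∞ threat model bounds the largest per-pixel change, while an ℓ2 threat model bounds the total Euclidean size; the Fourier analysis above considers perturbations of fixed ℓ2 norm.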

