Learning from Errors
Section 3.2 of Ilyas et al. (2019) shows that training a model only on adversarial errors leads to non-trivial generalization on the original test set. We show that these experiments are a special case of a more general phenomenon: learning from errors.
A Counterintuitive Result
We take a completely mislabeled training set (without modifying the inputs) and use it to train a model that generalizes to the original test set. We then show that this result, and the results of Ilyas et al. (2019), are special cases of model distillation.
We begin with the following question: what if we took the images in the training set (without any adversarial perturbations) and mislabeled them? Since the inputs are untouched and only the labels are wrong, intuition says that a model trained on this dataset should not generalize to the correctly-labeled test set. Nevertheless, we show that this intuition fails: such a model can generalize.
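To make the construction concrete, here is a minimal sketch of how such a mislabeled dataset could be assembled, under one illustrative assumption: the incorrect labels come from a previously trained model's own mistaken predictions on the unmodified inputs, which is one concrete way information about that model can leak into the labels. The helper names train_model and evaluate are hypothetical placeholders, not part of any library, and this is not the exact experimental pipeline.

```python
import torch

@torch.no_grad()
def build_mislabeled_dataset(teacher, X_train, y_train):
    """Keep only the clean inputs the teacher model misclassifies,
    relabeled with the teacher's (incorrect) predictions.
    This is one illustrative way to obtain a fully mislabeled,
    unmodified-input training set."""
    preds = teacher(X_train).argmax(dim=1)
    wrong = preds != y_train
    return X_train[wrong], preds[wrong]

# Hypothetical usage (train_model / evaluate are placeholder helpers):
#   teacher = train_model(X_train, y_train)            # original model
#   X_mis, y_mis = build_mislabeled_dataset(teacher, X_train, y_train)
#   student = train_model(X_mis, y_mis)                # sees only wrong labels
#   evaluate(student, X_test, y_test)                  # non-trivial accuracy
```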
Two-dimensional Illustration of Model Distillation
We construct a dataset of adversarial examples using a two-dimensional binary classification problem. We generate 32 random two-dimensional data points in [0,1]^2 and assign each point a random binary label. We then train a small feed-forward neural network on these examples; it predicts all 32 points correctly (panel (a) of the Figure below).
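A minimal PyTorch sketch of this setup appears below. The architecture (one 64-unit hidden layer), optimizer, and step count are illustrative choices, not necessarily the configuration behind the Figure.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# 32 random points in [0, 1]^2, each with a random binary label.
X = torch.rand(32, 2)
y = torch.randint(0, 2, (32,))

# A small feed-forward network for binary classification.
model = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Fit until the network memorizes the training set (panel (a)).
for step in range(2000):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()

train_acc = (model(X).argmax(dim=1) == y).float().mean().item()
print(f"training accuracy: {train_acc:.2f}")  # typically 1.00 (32/32)
```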
Next, we create adversarial examples for the original model using an l∞ ball of radius 0.12. In panel (a) of the Figure, we display the ϵ-ball around each training point. In panel (b), we show the adversarial examples that cause the model to change its prediction from correct to incorrect. We train a new feed-forward neural network on this dataset, resulting in the model shown in panel (c).
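Continuing the sketch above, the code below illustrates one way to carry out this step. It assumes an untargeted l∞ PGD attack of radius 0.12 and assumes each retained adversarial point is labeled with the original model's new (flipped) prediction; both choices are assumptions for illustration, since the attack and the labeling of the new dataset are not spelled out above.

```python
def pgd_linf(model, X, y, eps=0.12, alpha=0.02, steps=40):
    """Projected gradient ascent on the loss within an l∞ ball of radius eps
    (an illustrative untargeted attack; step size and iteration count are
    arbitrary choices)."""
    loss_fn = nn.CrossEntropyLoss()
    X_adv = X.clone().detach()
    for _ in range(steps):
        X_adv.requires_grad_(True)
        loss = loss_fn(model(X_adv), y)
        grad, = torch.autograd.grad(loss, X_adv)
        with torch.no_grad():
            X_adv = X_adv + alpha * grad.sign()
            X_adv = torch.max(torch.min(X_adv, X + eps), X - eps)  # project into the ε-ball
            X_adv = X_adv.clamp(0.0, 1.0)                          # stay inside [0, 1]^2
    return X_adv.detach()

X_adv = pgd_linf(model, X, y)
preds_adv = model(X_adv).argmax(dim=1)

# Keep only the points whose prediction flipped from correct to incorrect,
# labeled with the model's new (incorrect) prediction (panel (b)).
flipped = preds_adv != y
X_new, y_new = X_adv[flipped], preds_adv[flipped]

# Train a fresh network on this adversarial dataset (panel (c)).
new_model = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(new_model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
for step in range(2000):
    opt.zero_grad()
    loss_fn(new_model(X_new), y_new).backward()
    opt.step()
```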
Conclusion
Our experiments show that a model can generalize from a completely mislabeled training set, even when the inputs themselves are unmodified. This is a special case of model distillation, in which information about the original model is “leaked” into the mislabeled examples.
FAQs
Q: Why does this happen?
A: The phenomenon is a form of model distillation: information about the original model is “leaked” into the mislabeled examples through their labels, and the new model recovers that information during training.
Q: Is this a problem?
A: Not in itself. It shows that a model can generalize from a mislabeled training set because the labels carry information beyond their face value; whether that is useful or harmful depends on how it is exploited (see the next question).
Q: Can this be used for good or bad?
A: This phenomenon can be used for both good and bad. For example, labels produced by one model can be used to train another, as in model distillation; conversely, carefully constructed labels could be used to create a backdoor in a model.
Q: Is this a new phenomenon?
A: No. The underlying mechanism has been studied before in the context of model distillation. Our experiments show that it also operates when training on completely mislabeled datasets whose inputs are left unmodified.

