Correcting for Label Noise

Learning from Errors

Section 3.2 of Ilyas et al. (2019) shows that training a model only on adversarial errors (adversarial examples paired with the incorrect labels they receive) leads to non-trivial generalization on the original test set. We show that these experiments are a specific case of learning from errors.

A Counterintuitive Result

We take a completely mislabeled training set (without modifying the inputs) and use it to train a model that generalizes to the original test set. We then show that this result, and the results of Ilyas et al. (2019), are a special case of model distillation.

We begin with the following question: what if we took the images in the training set (without any adversarial perturbations) and mislabeled them? Since the inputs are unmodified and every label is wrong, intuition says that a model trained on this dataset should not generalize to the correctly-labeled test set. Nevertheless, we show that this intuition fails: such a model can generalize.

Two-dimensional Illustration of Model Distillation

We construct a dataset of adversarial examples using a two-dimensional binary classification problem. We generate 32 random two-dimensional data points in [0,1]^2 and assign each point a random binary label. We then train a small feed-forward neural network on these examples; it predicts all 32 examples correctly (panel (a) in the Figure below).
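
As a rough sketch of this setup, the snippet below (assuming PyTorch) generates the 32 random points and labels and fits a small feed-forward network to them. The hidden width, optimizer, learning rate, and number of training steps are illustrative assumptions, not details from the original experiment.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# 32 random points in [0, 1]^2, each assigned a random binary label.
X = torch.rand(32, 2)
y = torch.randint(0, 2, (32,))

# A small feed-forward network; the hidden width is an illustrative choice.
model = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Train until the network fits all 32 random labels (panel (a)).
for step in range(2000):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()

acc = (model(X).argmax(dim=1) == y).float().mean().item()
print(f"train accuracy: {acc:.2f}")  # typically reaches 1.00 on this tiny problem
```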

[Figure: (a) the original model with the ϵ-ball around each training point; (b) the adversarial examples that flip the model's prediction; (c) the model trained on those adversarial examples.]

Next, we create adversarial examples for the original model using an l∞ ball of radius 0.12. In panel (a) of the Figure above, we display the ϵ-ball around each training point. In panel (b), we show the adversarial examples which cause the model to change its prediction (from correct to incorrect). We train a new feed-forward neural network on this dataset, resulting in the model in panel (c).
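
Continuing the sketch above, the hedged example below generates adversarial examples with a simple projected-gradient attack inside the l∞ ball of radius 0.12, keeps the points whose prediction flips from correct to incorrect, and retrains a fresh network on them. The attack step size and iteration count, and the choice to label each kept point with the model's flipped prediction, are assumptions made for illustration.

```python
import torch
import torch.nn as nn

eps = 0.12  # l-infinity radius from the text

def pgd_linf(model, x, y, eps, step_size=0.02, n_steps=50):
    """Untargeted projected gradient ascent constrained to an l-infinity ball of radius eps."""
    x_adv = x.clone().detach()
    for _ in range(n_steps):
        x_adv.requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # project back into the ball
    return x_adv.detach()

# `model`, `X`, `y` come from the sketch above.
x_adv = pgd_linf(model, X, y, eps)
adv_pred = model(x_adv).argmax(dim=1)

# Keep only the examples whose prediction flipped from correct to incorrect,
# and (an assumption about the setup) label them with the flipped prediction.
flipped = adv_pred != y
X_flip, y_flip = x_adv[flipped], adv_pred[flipped]

# Train a fresh network of the same shape on the flipped examples (panel (c)).
model2 = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
opt2 = torch.optim.Adam(model2.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
for step in range(2000):
    opt2.zero_grad()
    loss_fn(model2(X_flip), y_flip).backward()
    opt2.step()
```

Because the kept examples sit just across the original model's decision boundary and carry the label from the other side, the retrained network tends to recover a boundary close to the original one, which is the distillation effect the text describes.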

Conclusion

Our experiments show that a model can generalize from a mislabeled training set, even if the inputs are unmodified and the labels are incorrect. This is a special case of model distillation, where information about the original model is “leaked” into the mislabeled examples.

FAQs

Q: Why does this happen?
A: This phenomenon is a result of model distillation, where information about the original model is “leaked” into the mislabeled examples.

Q: Is this a problem?
A: Not inherently. It shows that a model can generalize from a mislabeled training set because the incorrect labels still carry information about the original model's decision boundary, which can be useful in settings such as distillation.

Q: Can this be used for good or bad?
A: Like model distillation in general, it cuts both ways. The leaked information can be used constructively, for example to transfer one model's knowledge to another, or maliciously, for example to plant a backdoor by poisoning a training set with carefully mislabeled examples.

Q: Is this a new phenomenon?
A: No, this phenomenon has been studied before in the context of model distillation. However, our experiments show that it can also occur in the context of mislabeled training sets.
