swMATH ID: 35857
Software Authors: Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz
Description: mixup: Beyond Empirical Risk Minimization. Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures. We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.
Homepage: https://arxiv.org/abs/1710.09412
Source Code: https://github.com/facebookresearch/mixup-cifar10
Keywords: Machine Learning; arXiv_cs.LG; arXiv_stat.ML
Related Software: ImageNet; CIFAR; PyTorch; MNIST; Wasserstein GAN; Adam; MS-COCO; Python; MixMatch; AutoAugment; SGDR; AlexNet; Fashion-MNIST; EfficientNet; Swin Transformer; S4L; DARTS; BERT; RandAugment; U-Net
Cited in: 14 Publications
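
Example: The description states that mixup trains on convex combinations of pairs of examples and their labels. The following is a minimal sketch of that idea in plain Python, not the code from the linked repository. The function names `mixup_pair` and `mixup_batch`, the `alpha` parameter, and the use of one-hot label vectors are illustrative assumptions. Drawing the mixing weight from a Beta(alpha, alpha) distribution follows the paper rather than anything stated in this record.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, alpha=1.0, rng=random):
    """Mix two (features, one-hot label) pairs into one virtual example.

    Hypothetical helper for illustration; inputs are equal-length lists of floats.
    """
    # Mixing weight lam lies in [0, 1]. Sampling it from Beta(alpha, alpha)
    # is an assumption taken from the paper, not from this record.
    lam = rng.betavariate(alpha, alpha) if alpha > 0 else 1.0
    # Convex combination of the inputs and, with the same weight, of the labels.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix, lam


def mixup_batch(xs, ys, alpha=1.0, seed=None):
    """Pair every example with a randomly chosen partner from the same batch."""
    rng = random.Random(seed)
    partners = list(range(len(xs)))
    rng.shuffle(partners)
    mixed = []
    for i, j in enumerate(partners):
        x_mix, y_mix, _ = mixup_pair(xs[i], ys[i], xs[j], ys[j], alpha, rng)
        mixed.append((x_mix, y_mix))
    return mixed


if __name__ == "__main__":
    xs = [[0.0, 0.0], [1.0, 2.0], [4.0, 4.0]]
    ys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    for x_mix, y_mix in mixup_batch(xs, ys, alpha=1.0, seed=0):
        print(x_mix, y_mix)
```

Because the same weight is applied to inputs and labels, each mixed label remains a valid probability vector. A model trained on such pairs would typically use a loss that accepts soft targets, such as cross-entropy against the mixed label.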
