mixup

swMATH ID: 35857
Software Authors: Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz

Description: mixup: Beyond Empirical Risk Minimization. Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures. We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.

Homepage: https://arxiv.org/abs/1710.09412
Source Code: https://github.com/facebookresearch/mixup-cifar10
Keywords: Machine Learning; arXiv_cs.LG; arXiv_stat.ML
Related Software: ImageNet; CIFAR; PyTorch; MNIST; Wasserstein GAN; Adam; MS-COCO; Python; MixMatch; AutoAugment; SGDR; AlexNet; Fashion-MNIST; EfficientNet; Swin Transformer; S4L; DARTS; BERT; RandAugment; U-Net
Cited in: 14 Publications

Cited by 43 Authors
1 Blondé, Lionel
1 Chen, Songcan
1 Chen, Yiming
1 Chen, Yunfang
1 Cheng, Ran
1 Cheng, Xin
1 Chuang, Isaac L.
1 Dun, Hua
1 Feng, Xinxing
1 Gagné, Christian
1 Günnemann, Stephan
1 Hoos, Holger H.
1 Huang, Shengjun
1 Huang, Xin
1 Jain, Niharika
1 Jiang, Lu
1 Kalousis, Alexandros
1 Kambhampati, Subbarao
1 Khoo, Yuehaw
1 Kopetzki, Anna-Kathrin
1 Li, Shao-Yuan
1 Liang, Senwei
1 Ma, Tinghuai
1 Manikonda, Lydia
1 Northcutt, Curtis G.
1 Oberman, Adam M.
1 Olmo, Alberto
1 Pan, Tianci
1 Sengupta, Sailik
1 Shi, Ermin
1 Shi, Ye
1 Shu, Xin
1 Shui, Changjian
1 Strasser, Pablo
1 Szeliski, Richard
1 Tekalp, A. Murat
1 van Engelen, Jesper E.
1 Wang, Boyu
1 Xu, Long
1 Xu, Shubin
1 Yang, Haizhao
1 Yu, Suxiang
1 Zhang, Shuai

Cited in 7 Serials
5 Machine Learning
2 Mathematical Biosciences and Engineering
1 Artificial Intelligence
1 The Journal of Artificial Intelligence Research (JAIR)
1 Foundations and Trends in Computer Graphics and Vision
1 Texts in Computer Science
1 Communications on Applied Mathematics and Computation

Cited in 6 Fields
14 Computer science (68-XX)
3 Statistics (62-XX)
1 Partial differential equations (35-XX)
1 Numerical analysis (65-XX)
1 Biology and other natural sciences (92-XX)
1 Information and communication theory, circuits (94-XX)
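
Illustration: the description above says that mixup trains on convex combinations of pairs of examples and their labels. A minimal sketch of that idea follows, using only the Python standard library. The function names (`mixup_pair`, `sample_lambda`, `mixup_batch`) are hypothetical and are not taken from the linked repository. Drawing the mixing weight from a Beta(alpha, alpha) distribution follows the mixup paper. Using one weight per batch and pairing each example with a shuffled partner is an assumption made here for brevity. Labels are assumed to be one-hot vectors.

```python
import random
from typing import List, Sequence, Tuple


def mixup_pair(
    x_i: Sequence[float],
    y_i: Sequence[float],
    x_j: Sequence[float],
    y_j: Sequence[float],
    lam: float,
) -> Tuple[List[float], List[float]]:
    """Return lam * (x_i, y_i) + (1 - lam) * (x_j, y_j), elementwise."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1] for a convex combination")
    if len(x_i) != len(x_j) or len(y_i) != len(y_j):
        raise ValueError("paired examples must have matching lengths")
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix


def sample_lambda(alpha: float, rng: random.Random) -> float:
    """Draw the mixing weight from Beta(alpha, alpha); alpha must be > 0."""
    return rng.betavariate(alpha, alpha)


def mixup_batch(
    xs: Sequence[Sequence[float]],
    ys: Sequence[Sequence[float]],
    alpha: float,
    rng: random.Random,
) -> Tuple[List[List[float]], List[List[float]], float]:
    """Mix every example with a randomly chosen partner from the same batch.

    Assumption: a single lam is shared by the whole batch.
    """
    lam = sample_lambda(alpha, rng)
    partners = list(range(len(xs)))
    rng.shuffle(partners)
    mixed = [
        mixup_pair(xs[k], ys[k], xs[p], ys[p], lam)
        for k, p in enumerate(partners)
    ]
    return [m[0] for m in mixed], [m[1] for m in mixed], lam


if __name__ == "__main__":
    # Fixed weight so the result is easy to check by hand.
    print(mixup_pair([0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0], 0.25))
    # -> ([3.0, 5.0], [0.25, 0.75])
```

Because the same weight is applied to inputs and labels, each mixed label remains a valid probability vector: its entries are non-negative and sum to one whenever the original one-hot labels do.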