Active subspace of neural networks: structural analysis and universal attacks. (English) Zbl 07482298

MSC:

90C26 Nonconvex programming, global optimization
15A18 Eigenvalues, singular values, and eigenvectors
62G35 Nonparametric robustness

References:

[1] S. Abdoli, L. G. Hafemann, J. Rony, I. B. Ayed, P. Cardinal, and A. L. Koerich, Universal Adversarial Audio Perturbations, arXiv preprint, arXiv:1908.03173, 2019.
[2] A. Aghasi, A. Abdi, N. Nguyen, and J. Romberg, Net-Trim: Convex pruning of deep neural networks with performance guarantee, in Proceedings of the Conference on Neural Information Processing Systems, 2017, pp. 3177-3186.
[3] L. Armijo, Minimization of functions having Lipschitz continuous first partial derivatives, Pacific J. Math., 16 (1966), pp. 1-3. · Zbl 0202.46105
[4] S. Baluja and I. Fischer, Adversarial Transformation Networks: Learning to Generate Adversarial Examples, arXiv preprint, arXiv:1703.09387, 2017.
[5] M. Behjati, S.-M. Moosavi-Dezfooli, M. S. Baghshah, and P. Frossard, Universal adversarial attacks on text classifiers, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2019, pp. 7345-7349.
[6] E. G. Birgin, J. M. Martínez, and M. Raydan, Nonmonotone spectral projected gradient methods on convex sets, SIAM J. Optim., 10 (2000), pp. 1196-1211. · Zbl 1047.90077
[7] H. Cai, L. Zhu, and S. Han, ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware, arXiv preprint, arXiv:1812.00332, 2018.
[8] N. Carlini and D. Wagner, Towards evaluating the robustness of neural networks, in Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), IEEE, 2017, pp. 39-57.
[9] P. G. Constantine, Active Subspaces: Emerging Ideas for Dimension Reduction in Parameter Studies, SIAM, Philadelphia, PA, 2015. · Zbl 1431.65001
[10] P. G. Constantine, E. Dow, and Q. Wang, Active subspace methods in theory and practice: Applications to kriging surfaces, SIAM J. Sci. Comput., 36 (2014), pp. A1500-A1524. · Zbl 1464.62049
[11] P. G. Constantine, M. Emory, J. Larsson, and G. Iaccarino, Exploiting active subspaces to quantify uncertainty in the numerical simulation of the HyShot II scramjet, J. Comput. Phys., 302 (2015), pp. 1-20. · Zbl 1349.76153
[12] M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv, and Y. Bengio, Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1, arXiv preprint, arXiv:1602.02830, 2016. · Zbl 1468.68183
[13] C. Cui and Z. Zhang, Stochastic collocation with non-Gaussian correlated process variations: Theory, algorithms and applications, IEEE Trans. Components Packaging Manuf. Tech., 9 (2019), pp. 1362-1375.
[14] C. Cui and Z. Zhang, High-dimensional uncertainty quantification of electronic and photonic IC with non-Gaussian correlated process variations, IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., 39 (2020), pp. 1649-1661.
[15] L. Deng, P. Jiao, J. Pei, Z. Wu, and G. Li, GXNOR-Net: Training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework, Neural Networks, 100 (2018), pp. 49-58. · Zbl 1434.68504
[16] G. K. Dziugaite, Z. Ghahramani, and D. M. Roy, A Study of the Effect of JPG Compression on Adversarial Images, arXiv preprint, arXiv:1608.00853, 2016.
[17] J. Frankle and M. Carbin, The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, arXiv preprint, arXiv:1803.03635, 2018.
[18] T. Garipov, D. Podoprikhin, A. Novikov, and D. Vetrov, Ultimate Tensorization: Compressing Convolutional and FC Layers Alike, arXiv preprint, arXiv:1611.03214, 2016.
[19] R. Ge, R. Wang, and H. Zhao, Mildly Overparametrized Neural Nets Can Memorize Training Data Efficiently, arXiv preprint, arXiv:1909.11837, 2019.
[20] R. G. Ghanem and P. D. Spanos, Stochastic finite element method: Response statistics, in Stochastic Finite Elements: A Spectral Approach, Springer, New York, 1991, pp. 101-119.
[21] M. Ghashami, E. Liberty, J. M. Phillips, and D. P. Woodruff, Frequent directions: Simple and deterministic matrix sketching, SIAM J. Comput., 45 (2016), pp. 1762-1792. · Zbl 1348.65075
[22] I. J. Goodfellow, J. Shlens, and C. Szegedy, Explaining and Harnessing Adversarial Examples, arXiv preprint, arXiv:1412.6572, 2014.
[23] A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber, Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks, in Proceedings of the 23rd International Conference on Machine Learning, ACM, 2006, pp. 369-376.
[24] N. Halko, P.-G. Martinsson, and J. A. Tropp, Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions, SIAM Rev., 53 (2011), pp. 217-288. · Zbl 1269.65043
[25] S. Han, H. Mao, and W. J. Dally, Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, arXiv preprint, arXiv:1510.00149, 2015.
[26] C. Hawkins and Z. Zhang, Bayesian Tensorized Neural Networks with Automatic Rank Selection, arXiv preprint, arXiv:1905.10478, 2019.
[27] Y. He, J. Lin, Z. Liu, H. Wang, L.-J. Li, and S. Han, AMC: AutoML for model compression and acceleration on mobile devices, in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 784-800.
[28] G. Hinton, O. Vinyals, and J. Dean, Distilling the Knowledge in a Neural Network, arXiv preprint, arXiv:1503.02531, 2015.
[29] W. Hoeffding, Probability inequalities for sums of bounded random variables, in The Collected Works of Wassily Hoeffding, Springer, New York, 1994, pp. 409-426.
[30] D. W. Hosmer, Jr., S. Lemeshow, and R. X. Sturdivant, Applied Logistic Regression, John Wiley & Sons, New York, 2013. · Zbl 1276.62050
[31] I. Jolliffe, Principal component analysis, in International Encyclopedia of Statistical Science, Springer, Berlin, 2011, pp. 1094-1096.
[32] C. Kanbak, S.-M. Moosavi-Dezfooli, and P. Frossard, Geometric robustness of deep networks: Analysis and improvement, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4441-4449.
[33] V. Khrulkov and I. Oseledets, Art of singular vectors and universal adversarial perturbations, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 8562-8570.
[34] D. P. Kingma and J. Ba, Adam: A Method for Stochastic Optimization, arXiv preprint, arXiv:1412.6980, 2014.
[35] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, in Proceedings of the Conference on Neural Information Processing Systems, 2012, pp. 1097-1105.
[36] V. Lebedev, Y. Ganin, M. Rakhuba, I. Oseledets, and V. Lempitsky, Speeding-up Convolutional Neural Networks Using Fine-Tuned CP-Decomposition, arXiv preprint, arXiv:1412.6553, 2014.
[37] D. R. Lide, Handbook of mathematical functions, in A Century of Excellence in Measurements, Standards, and Technology, CRC Press, Boca Raton, FL, 2018, pp. 135-139.
[38] L. Liu, L. Deng, X. Hu, M. Zhu, G. Li, Y. Ding, and Y. Xie, Dynamic Sparse Graph for Efficient Deep Learning, arXiv preprint, arXiv:1810.00859, 2018.
[39] Z. Liu, M. Sun, T. Zhou, G. Huang, and T. Darrell, Rethinking the Value of Network Pruning, arXiv preprint, arXiv:1810.05270, 2018.
[40] S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard, Universal adversarial perturbations, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1765-1773.
[41] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard, DeepFool: A simple and accurate method to fool deep neural networks, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2574-2582.
[42] P. Neekhara, S. Hussain, P. Pandey, S. Dubnov, J. McAuley, and F. Koushanfar, Universal Adversarial Perturbations for Speech Recognition Systems, arXiv preprint, arXiv:1905.03828, 2019.
[43] A. Novikov, D. Podoprikhin, A. Osokin, and D. P. Vetrov, Tensorizing neural networks, in Proceedings of the Conference on Neural Information Processing Systems, 2015, pp. 442-450.
[44] S. Oymak and M. Soltanolkotabi, Towards moderate overparameterization: Global convergence guarantees for training shallow neural networks, IEEE J. Sel. Areas Inform. Theory, 1 (2020), pp. 84-105.
[45] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami, The limitations of deep learning in adversarial settings, in Proceedings of the 2016 IEEE European Symposium on Security and Privacy (EuroS&P), IEEE, 2016, pp. 372-387.
[46] A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio, FitNets: Hints for Thin Deep Nets, arXiv preprint, arXiv:1412.6550, 2014.
[47] H. L. Royden, Real Analysis, Macmillan, New York, 2010. · Zbl 1191.26002
[48] T. M. Russi, Uncertainty Quantification with Experimental Data and Complex System Models, PhD thesis, UC Berkeley, Berkeley, CA, 2010.
[49] T. N. Sainath, B. Kingsbury, V. Sindhwani, E. Arisoy, and B. Ramabhadran, Low-rank matrix factorization for deep neural network training with high-dimensional output targets, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013, pp. 6655-6659.
[50] S. Scardapane, D. Comminiello, A. Hussain, and A. Uncini, Group sparse regularization for deep neural networks, Neurocomputing, 241 (2017), pp. 81-89.
[51] M. Schmidt, E. Berg, M. Friedlander, and K. Murphy, Optimizing costly functions with simple constraints: A limited-memory projected quasi-Newton algorithm, in Proceedings of the International Conference on Artificial Intelligence and Statistics, 2009, pp. 456-463.
[52] A. C. Serban and E. Poll, Adversarial Examples-A Complete Characterisation of the Phenomenon, arXiv preprint, arXiv:1810.01185, 2018.
[53] S. Shalev-Shwartz and T. Zhang, Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, in Proceedings of the International Conference on Machine Learning, 2014, pp. 64-72. · Zbl 1342.90103
[54] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, Intriguing Properties of Neural Networks, arXiv preprint, arXiv:1312.6199, 2013.
[55] D. Xiu and G. E. Karniadakis, Modeling uncertainty in steady state diffusion problems via generalized polynomial chaos, Comput. Methods Appl. Mech. Engrg., 191 (2002), pp. 4927-4948. · Zbl 1016.65001
[56] D. Xiu and G. E. Karniadakis, The Wiener-Askey polynomial chaos for stochastic differential equations, SIAM J. Sci. Comput., 24 (2002), pp. 619-644. · Zbl 1014.65004
[57] S. Ye, X. Feng, T. Zhang, X. Ma, S. Lin, Z. Li, K. Xu, W. Wen, S. Liu, J. Tang, M. Fardad, X. Lin, Y. Liu, and Y. Wang, Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates Using ADMM, arXiv preprint, arXiv:1903.09769, 2019.
[58] T. Young, D. Hazarika, S. Poria, and E. Cambria, Recent trends in deep learning based natural language processing, IEEE Comput. Intell. Mag., 13 (2018), pp. 55-75.
[59] V. P. Zankin, G. V. Ryzhakov, and I. Oseledets, Gradient Descent-Based D-Optimal Design for the Least-Squares Polynomial Approximation, arXiv preprint, arXiv:1806.06631, 2018.