
Adaptive function-on-scalar regression with a smoothing elastic net. (English) Zbl 1470.62190

Summary: This paper presents a new methodology, called AFSSEN, to simultaneously select significant predictors and produce smooth estimates in a high-dimensional function-on-scalar linear model with sub-Gaussian errors. Outcomes are assumed to lie in a general real separable Hilbert space, \(\mathbb{H}\), while parameters lie in a subspace known as a Cameron-Martin space, \(\mathbb{K}\). Cameron-Martin spaces are closely related to Reproducing Kernel Hilbert Spaces, so the parameter estimates inherit particular properties, such as smoothness or periodicity, without enforcing such properties on the data. We propose a regularization method in the style of an adaptive Elastic Net penalty that mixes two types of functional norms, providing fine-tuned control of both the smoothing and the variable selection in the estimated model. Asymptotic theory is provided in the form of a functional oracle property, and the paper concludes with a simulation study demonstrating the advantages of AFSSEN over existing methods in terms of prediction error and variable selection.
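
As a rough illustration of what "mixing two types of functional norms" can look like, the following is a minimal sketch and not the paper's formulation. It assumes coefficient functions sampled on a uniform grid, uses a discretized L2 norm as a stand-in for the \(\mathbb{H}\)-norm, and uses a second-difference roughness norm as a stand-in for the \(\mathbb{K}\)-norm. The function names, the tuning parameters `lam_h` and `lam_k`, the adaptive `weights`, and the assignment of the unsquared norm to selection and the squared norm to smoothing are all hypothetical choices made for illustration.

```python
import numpy as np

def h_norm(beta, dt):
    """Discretized L2 norm on a uniform grid (assumed stand-in for the H-norm)."""
    return np.sqrt(np.sum(beta ** 2) * dt)

def k_norm_sq(beta, dt):
    """Squared L2 norm plus squared L2 norm of second differences
    (assumed stand-in for a squared K-norm that penalizes roughness)."""
    d2 = np.diff(beta, n=2) / dt ** 2
    return np.sum(beta ** 2) * dt + np.sum(d2 ** 2) * dt

def mixed_penalty(B, weights, lam_h, lam_k, dt):
    """Elastic-net-style penalty on a (p, m) array B whose rows are
    coefficient functions evaluated on m grid points.

    The weighted, unsquared norm acts like a group-lasso term (it can zero
    out whole functions); the squared norm acts like a ridge term on roughness.
    """
    selection = sum(w * h_norm(b, dt) for w, b in zip(weights, B))
    smoothing = sum(k_norm_sq(b, dt) for b in B)
    return lam_h * selection + 0.5 * lam_k * smoothing

# Example: one constant and one oscillating coefficient function.
grid = np.linspace(0.0, 1.0, 101)
dt = grid[1] - grid[0]
B = np.vstack([np.ones_like(grid), np.sin(20 * np.pi * grid)])
penalty = mixed_penalty(B, weights=[1.0, 1.0], lam_h=1.0, lam_k=2.0, dt=dt)
```

In this sketch, raising `lam_k` penalizes the oscillating row far more than the constant one, while raising `lam_h` or an individual weight pushes an entire row toward zero; that separation is the kind of two-knob control the summary describes.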

MSC:

62R10 Functional data analysis
62H12 Estimation in multivariate analysis
62G08 Nonparametric regression and quantile regression
46E22 Hilbert spaces with reproducing kernels (= (proper) functional Hilbert spaces, including de Branges-Rovnyak and other structured spaces)

Software:

flm; fda (R); Rcpp; R; NLopt
