
Increasing the usefulness of additive spline models by knot removal. (English) Zbl 1452.62033

Summary: Modern techniques for fitting generalized additive models mostly rely on basis expansions of covariates using a large number of basis functions and penalized estimation of parameters. For example, a mixed model approach is used to fit a model for children’s lung function that allows for non-linear influence of several covariates available in a substantial data set. While the resulting model is expected to have good prediction performance, its handling beyond simple visual presentation is problematic. It is shown how the number of basis functions of the underlying B-spline representation can be reduced by knot removal techniques without refitting, while preserving the shape of the fitted functions. The condition for exact knot removal is extended towards approximate knot removal by incorporating the covariance matrix of the initial parameter estimates, resulting in considerable simplification of the model. Covariance matrices for the transformed parameter estimates are provided. It is demonstrated that enforcing the knot removal condition during estimation leads to the difference penalties employed in the P-spline approach for estimation of B-spline coefficients, and therefore provides a further justification for this type of penalty. A final transform to a truncated power basis provides a simple equation for the model. This increases transportability, while retaining properties of the initial fit such as good prediction performance.
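The link between knot removal and difference penalties can be illustrated with a small numerical sketch. It is not the authors' algorithm, and it assumes equally spaced knots and cubic B-splines. Under those assumptions the third derivative of the spline is piecewise constant, and its jump at an interior knot is proportional to a fourth-order difference of neighbouring coefficients. When those differences vanish, the knots carry no information and the curve is a single cubic polynomial. All names below (`bspline_basis`, `spline`, `diff`, the chosen coefficients) are hypothetical and serve only this illustration.

```python
def bspline_basis(i, p, t, x):
    """Cox-de Boor recursion: i-th B-spline of degree p on knots t, evaluated at x."""
    if p == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left = right = 0.0
    if t[i + p] > t[i]:
        left = (x - t[i]) / (t[i + p] - t[i]) * bspline_basis(i, p - 1, t, x)
    if t[i + p + 1] > t[i + 1]:
        right = ((t[i + p + 1] - x) / (t[i + p + 1] - t[i + 1])
                 * bspline_basis(i + 1, p - 1, t, x))
    return left + right


def spline(x, coef, t, p):
    """Evaluate sum_i coef[i] * B_i(x)."""
    return sum(c * bspline_basis(i, p, t, x) for i, c in enumerate(coef))


def diff(seq, order):
    """Repeated first differences, i.e. the difference operator of a given order."""
    for _ in range(order):
        seq = [b - a for a, b in zip(seq, seq[1:])]
    return seq


p = 3                                      # cubic B-splines
knots = [float(k) for k in range(12)]      # assumption: equally spaced knots
n_basis = len(knots) - p - 1               # 8 basis functions
xs = [3.25 + 0.5 * k for k in range(9)]    # equally spaced points in the domain [3, 8)

# Coefficients that are cubic in the index: all fourth differences vanish.
coef = [float(i**3 - 2 * i) for i in range(n_basis)]

# Same coefficients with one entry disturbed: fourth differences no longer vanish.
perturbed = list(coef)
perturbed[4] += 1.0

# A single cubic polynomial has zero fourth differences on an equally spaced grid,
# so these values show whether the curve still "needs" its interior knots.
fourth_diff_exact = diff([spline(x, coef, knots, p) for x in xs], 4)
fourth_diff_perturbed = diff([spline(x, perturbed, knots, p) for x in xs], 4)

print(diff(coef, 4), max(abs(d) for d in fourth_diff_exact))
print(diff(perturbed, 4), max(abs(d) for d in fourth_diff_perturbed))
```

In the first case every interior knot could be removed without changing the curve. In the second, the knots near the disturbed coefficient are needed. A difference penalty shrinks exactly these fourth-order differences, which is one way to read the summary's statement that enforcing the knot removal condition leads to P-spline penalties.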

MSC:

62-08 Computational methods for problems pertaining to statistics
62G07 Density estimation
62G08 Nonparametric regression and quantile regression
65D07 Numerical computation using splines

Software:

SemiPar; gamair
