Generalized linear array models with applications to multidimensional smoothing. (English) Zbl 1110.62090

Summary: Data with an array structure are common in statistics, and the design or regression matrix for the analysis of such data can often be written as a Kronecker product. Factorial designs, contingency tables and smoothing of data on multidimensional grids are three general classes of such data and models. In this setting, we develop an arithmetic of arrays which allows us to define the expectation of the data array as a sequence of nested matrix operations on a coefficient array. We show how this arithmetic leads to low-storage, high-speed computation in the scoring algorithm of the generalized linear model. We call the resulting model a generalized linear array model and apply the methodology to the smoothing of multidimensional arrays. We illustrate our procedure with the analysis of three data sets: mortality data indexed by age at death and year of death; spatially varying microarray background data; and disease incidence data indexed by age at death, year of death and month of death.
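The "nested matrix operations" in the summary can be illustrated in two dimensions with the standard Kronecker identity vec(B1 A B2ᵀ) = (B2 ⊗ B1) vec(A), where vec stacks columns. The NumPy sketch below is only an illustration under stated assumptions: the names `B1`, `B2` and `A`, the dimensions, and the random entries are hypothetical and do not come from the paper, and the code does not reproduce the authors' algorithm for the scoring iterations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical marginal regression (basis) matrices, one per array dimension.
n1, c1 = 6, 3
n2, c2 = 5, 4
B1 = rng.normal(size=(n1, c1))
B2 = rng.normal(size=(n2, c2))

# Hypothetical coefficient array (c1 x c2).
A = rng.normal(size=(c1, c2))

# Conventional route: build the full Kronecker-product design matrix
# and multiply it by vec(A), using column-major (Fortran-order) stacking.
B_full = np.kron(B2, B1)                      # shape (n1*n2, c1*c2)
eta_full = B_full @ A.reshape(-1, order="F")  # length n1*n2

# Array route: two small matrix products, no Kronecker product is formed.
eta_array = B1 @ A @ B2.T                     # shape (n1, n2)

# The two linear predictors agree once the array is stacked column-wise.
match = np.allclose(eta_full, eta_array.reshape(-1, order="F"))
print(match)
```

In this toy case the full design matrix has n1·n2·c1·c2 entries, whereas the array route stores only the two marginal matrices, which is the kind of storage saving the summary refers to.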


62J12 Generalized linear models (logistic models)
62P10 Applications of statistics to biology and medical sciences; meta analysis
65C60 Computational problems in statistics (MSC2010)
62H12 Estimation in multivariate analysis


FITPACK; mgcv; SemiPar; R

