A new algorithm for fitting semi-parametric variance regression models. (English) Zbl 1505.62340

Summary: Variance regression allows for heterogeneous variance, or heteroscedasticity, by incorporating a regression model into the variance. This paper uses a variant of the expectation-maximisation algorithm to develop a new method for fitting additive variance regression models that allow for regression in both the mean and the variance. The algorithm extends readily to B-spline bases, so that a semi-parametric model can be used for both the mean and the variance. Although methods for fitting these types of models already exist, the new algorithm provides a reliable alternative that is not susceptible to the numerical instability that can arise in this constrained estimation context. We apply the algorithm in a series of simulation studies, which show that it can recover the true model in a variety of scenarios, and use it to analyse illustrative data. We also study automatic selection of model complexity using information-based criteria, and show that the Akaike information criterion is useful for choosing the optimal number of knots in a B-spline model. An R package is available for implementing these methods.
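
As a rough illustration of the model class described in the summary, the sketch below writes out a Gaussian log-likelihood in which both the mean and the variance are linear in a single covariate, together with the Akaike information criterion. It is not the paper's EM-type algorithm and not the interface of the R package. The function names, the single-covariate form, and the toy data are assumptions made only for this example.

```python
import math


def variance_regression_loglik(y, x, beta, alpha):
    """Gaussian log-likelihood with mean b0 + b1*x and variance a0 + a1*x.

    Hypothetical helper for illustration. Returns -inf when any fitted
    variance is non-positive: the variance is modelled additively, so
    positivity is a constraint on (a0, a1) rather than automatic.
    """
    b0, b1 = beta
    a0, a1 = alpha
    total = 0.0
    for yi, xi in zip(y, x):
        mean = b0 + b1 * xi
        var = a0 + a1 * xi
        if var <= 0.0:
            return -math.inf
        total += -0.5 * (math.log(2.0 * math.pi * var) + (yi - mean) ** 2 / var)
    return total


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2.0 * n_params - 2.0 * loglik


# Toy data (made up for this sketch): spread of y grows with x.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
y = [0.1, 0.4, 1.3, 1.0, 2.9, 1.6, 4.4, 2.3]

# Two candidate parameter settings, chosen by hand rather than fitted.
constant_var = variance_regression_loglik(y, x, (0.0, 1.0), (1.0, 0.0))
linear_var = variance_regression_loglik(y, x, (0.0, 1.0), (0.1, 0.5))

# The constant-variance candidate has 3 free parameters, the other has 4.
print("AIC, constant variance:", aic(constant_var, 3))
print("AIC, linear variance:  ", aic(linear_var, 4))
```

In a B-spline version of the same idea, the covariate terms would be replaced by spline basis functions, and the AIC would be compared across fits with different numbers of knots.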


62-08 Computational methods for problems pertaining to statistics
62F10 Point estimation

