Monte Carlo modified profile likelihood in models for clustered data. (English) Zbl 1462.62381

Summary: The main focus of the analysts who deal with clustered data is usually not on the clustering variables, and hence the group-specific parameters are treated as nuisance. If a fixed effects formulation is preferred and the total number of clusters is large relative to the single-group sizes, classical frequentist techniques relying on the profile likelihood are often misleading. The use of alternative tools, such as modifications to the profile likelihood or integrated likelihoods, for making accurate inference on a parameter of interest can be complicated by the presence of nonstandard modelling and/or sampling assumptions. We show here how to employ Monte Carlo simulation in order to approximate the modified profile likelihood in some of these unconventional frameworks. The proposed solution is widely applicable and is shown to retain the usual properties of the modified profile likelihood. The approach is examined in two instances particularly relevant in applications, i.e. missing-data models and survival models with unspecified censoring distribution. The effectiveness of the proposed solution is validated via simulation studies and two clinical trial applications.
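As an illustration of why the profile likelihood can mislead when there are many clusters of small size, the sketch below simulates the classical Neyman-Scott setting: normal responses with one fixed mean per cluster and a common variance. It is not the authors' Monte Carlo procedure. The function name `neyman_scott_estimates`, the simulation settings, and the choice of model are assumptions made only for this example. The profile-likelihood estimate of the variance divides the within-cluster sum of squares by the total sample size, while the corrected estimate divides by the residual degrees of freedom, which is the kind of adjustment that modified or conditional likelihoods deliver in this simple model.

```python
import numpy as np


def neyman_scott_estimates(n_clusters=500, cluster_size=2, sigma2=1.0, seed=0):
    """Simulate clustered normal data and return two variance estimates.

    Returns (profile_mle, adjusted), where profile_mle maximises the profile
    likelihood for the common variance and adjusted uses the residual
    degrees of freedom instead of the total sample size.
    """
    rng = np.random.default_rng(seed)

    # One nuisance mean per cluster (fixed effects); values are arbitrary.
    mu = rng.normal(0.0, 5.0, size=n_clusters)

    # Responses: cluster mean plus independent N(0, sigma2) noise.
    noise = rng.normal(0.0, np.sqrt(sigma2), size=(n_clusters, cluster_size))
    y = mu[:, None] + noise

    # Pooled within-cluster sum of squares about the cluster averages.
    ss = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum()

    # Profile likelihood estimate: divides by the total number of observations.
    profile_mle = ss / (n_clusters * cluster_size)

    # Degrees-of-freedom correction: one mean estimated per cluster.
    adjusted = ss / (n_clusters * (cluster_size - 1))
    return profile_mle, adjusted


profile_mle, adjusted = neyman_scott_estimates()
print(f"profile likelihood estimate of sigma^2: {profile_mle:.3f}")
print(f"degrees-of-freedom corrected estimate:  {adjusted:.3f}")
```

With clusters of size two, the profile estimate converges to half the true variance however many clusters are added, whereas the corrected estimate stays close to the true value. This is the incidental-parameter effect that motivates modified profile likelihoods.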


62H30 Classification and discrimination; cluster analysis (statistical aspects)
62N01 Censored data models
62D10 Missing data
62G20 Asymptotic properties of nonparametric inference
62P10 Applications of statistics to biology and medical sciences; meta analysis
65C05 Monte Carlo methods


bootlib; panelMPL; gee; R; survival

