Relaxation penalties and priors for plausible modeling of nonidentified bias sources. (English) Zbl 1328.62051

Summary: In designed experiments and surveys, known laws or design features provide checks on the most relevant aspects of a model and identify the target parameters. In contrast, in most observational studies in the health and social sciences, the primary study data do not identify and may not even bound target parameters. Discrepancies between target and analogous identified parameters (biases) are then of paramount concern, which forces a major shift in modeling strategies. Conventional approaches are based on conditional testing of equality constraints, which correspond to implausible point-mass priors. When these constraints are not identified by available data, however, no such testing is possible. In response, implausible constraints can be relaxed into penalty functions derived from plausible prior distributions. The resulting models can be fit within familiar full or partial likelihood frameworks.

The absence of identification renders all analyses part of a sensitivity analysis. In this view, results from single models are merely examples of what might be plausibly inferred. Nonetheless, just one plausible inference may suffice to demonstrate inherent limitations of the data. Points are illustrated with misclassified data from a study of sudden infant death syndrome. Extensions to confounding, selection bias and more complex data structures are outlined.
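
The idea of relaxing an equality constraint into a prior-derived penalty can be sketched on a toy misclassification problem. The following is only an illustration under stated assumptions, not the paper's model or data: the counts (30 of 100 subjects classified as exposed), the normal priors on the logit scale for sensitivity `se` and specificity `sp`, and the names `penalized_loglik` and `fit` are all hypothetical. The conventional analysis imposes `se = sp = 1` (a point-mass prior); here that constraint is replaced by a quadratic penalty, and the true prevalence `p` is profiled out over a grid.

```python
import math


def logit(u):
    return math.log(u / (1.0 - u))


def penalized_loglik(p, se, sp, x, n, prior):
    """Binomial log-likelihood of the observed count minus quadratic penalties.

    The penalties are (up to a constant) negative normal log-priors on
    logit(se) and logit(sp); prior maps "se"/"sp" to (mean, sd) on that scale.
    """
    # Probability that a subject is *classified* as exposed.
    q = se * p + (1.0 - sp) * (1.0 - p)
    loglik = x * math.log(q) + (n - x) * math.log(1.0 - q)
    penalty = 0.0
    for value, (mean, sd) in ((se, prior["se"]), (sp, prior["sp"])):
        penalty += 0.5 * ((logit(value) - mean) / sd) ** 2
    return loglik - penalty


def fit(x, n, prior, steps=99):
    """Grid search over (se, sp); p is profiled out in closed form."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    best = None
    for se in grid:
        for sp in grid:
            if se + sp <= 1.0:
                continue  # keep the classifier better than chance
            # For fixed (se, sp) the likelihood is maximized where q = x / n,
            # clipped so that p stays inside (0, 1).
            p = (x / n - (1.0 - sp)) / (se + sp - 1.0)
            p = min(max(p, 1e-6), 1.0 - 1e-6)
            value = penalized_loglik(p, se, sp, x, n, prior)
            if best is None or value > best[0]:
                best = (value, p, se, sp)
    return best


# Hypothetical data and priors (assumptions for illustration only).
x_obs, n_obs = 30, 100
prior = {"se": (logit(0.8), 0.5), "sp": (logit(0.9), 0.5)}
value, p_hat, se_hat, sp_hat = fit(x_obs, n_obs, prior)
print("constrained estimate (se = sp = 1):", x_obs / n_obs)
print("penalized estimate of p:", round(p_hat, 4), "at se =", se_hat, "sp =", sp_hat)
```

Because `p`, `se` and `sp` are not jointly identified by a single binomial count, the fitted `se` and `sp` simply sit at their prior modes and the estimate of `p` is driven by the prior. Rerunning `fit` with different priors is therefore a sensitivity analysis rather than a search for one correct answer.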

MSC:

62C10 Bayesian problems; characterization of Bayes procedures
62P10 Applications of statistics to biology and medical sciences; meta analysis
92C50 Medical applications (general)

Software:

ElemStatLearn
