Missing information principle: a unified approach for general truncated and censored survival data problems.

*(English)* Zbl 1397.62376

Summary: It is well known that truncated survival data are subject to sampling bias, where the sampling weight depends on the underlying truncation time distribution. Recently, there has been rising interest in developing methods that better exploit the information about the truncation time, and thus the sampling weight function, to obtain more efficient estimation. In this paper, we propose to treat truncation and censoring as a “missing data mechanism” and apply the missing information principle to develop a unified framework for analyzing left-truncated and right-censored data with unspecified or known truncation time distributions. Our framework is structured in a way that is easy to understand and offers great flexibility for handling different types of models. Moreover, a new test for checking the independence between the underlying truncation time and survival time is derived along the same lines. The proposed hypothesis testing procedure utilizes all observed data and hence can yield much higher power than the conditional Kendall’s tau test, which involves only comparable pairs of observations under truncation. Simulation studies with practical sample sizes are conducted to compare the performance of the proposed method with that of its competitors. The proposed methodologies are applied to a dementia study and a nursing house study for illustration.
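To illustrate why the conditional Kendall’s tau test uses only part of the data, the following is a minimal sketch of the conditional Kendall’s tau statistic for left-truncated data without censoring. It is not the test proposed in the paper. The function name `conditional_kendall_tau` and the toy data are assumptions made for illustration. A pair (i, j) is treated as comparable when the larger truncation time does not exceed the smaller survival time, and only such pairs contribute to the statistic.

```python
from itertools import combinations


def conditional_kendall_tau(trunc, surv):
    """Conditional Kendall's tau for left-truncated data (no censoring).

    trunc: truncation times A_i; surv: survival times X_i, with A_i <= X_i.
    Returns (tau_c, number of comparable pairs, total number of pairs).
    Hypothetical helper written for illustration only.
    """
    def sign(v):
        return (v > 0) - (v < 0)

    total = 0        # sum of concordance signs over comparable pairs
    comparable = 0   # pairs with max(A_i, A_j) <= min(X_i, X_j)
    n_pairs = 0      # all unordered pairs
    for (a_i, x_i), (a_j, x_j) in combinations(zip(trunc, surv), 2):
        n_pairs += 1
        if max(a_i, a_j) <= min(x_i, x_j):
            comparable += 1
            total += sign((a_i - a_j) * (x_i - x_j))

    tau = total / comparable if comparable else float("nan")
    return tau, comparable, n_pairs


# Toy data (assumed, not from the paper): only 3 of the 6 pairs are comparable.
trunc = [1.0, 2.0, 3.0, 0.5]
surv = [4.0, 2.5, 6.0, 1.5]
tau_c, n_comparable, n_pairs = conditional_kendall_tau(trunc, surv)
print(tau_c, n_comparable, n_pairs)
```

In this toy example half of the pairs are discarded as non-comparable, which is the loss of information that a test using all observed data aims to avoid.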

##### MSC:

62N01 | Censored data models |

62B10 | Statistical aspects of information-theoretic topics |

62G07 | Density estimation |

62P10 | Applications of statistics to biology and medical sciences; meta analysis |

##### Keywords:

Kendall’s tau; inverse probability weighted estimator; outcome-dependent sampling; prevalent sampling; self-consistency algorithm
