Breslow, Norman E.; Lumley, Thomas
Semiparametric models and two-phase samples: applications to Cox regression. (English) Zbl 1347.60008
Banerjee, M. (ed.) et al., From probability to statistics and back: high-dimensional models and processes. A Festschrift in honor of Jon A. Wellner. Including papers from the conference, Seattle, WA, USA, July 28–31, 2010. Beachwood, OH: IMS, Institute of Mathematical Statistics (ISBN 978-0-940600-83-6). Institute of Mathematical Statistics Collections 9, 65-77 (2013).
Summary: A standard estimation method when fitting parametric models to data from two-phase stratified samples is inverse probability weighting of the estimating equations. In previous work, we applied this approach to likelihood equations for both Euclidean and non-Euclidean parameters in semi-parametric models. We proved weak convergence of the inverse probability weighted empirical process and derived an asymptotic expansion for the estimator of the Euclidean parameter. We also showed how adjustment of the sampling weights by their calibration to known totals of auxiliary variables, or their estimation using these same variables, could markedly improve efficiency.
Here we consider joint estimation of Euclidean and non-Euclidean parameters. Our asymptotic expansion for the non-Euclidean parameter is apparently new even in the special case of simple random sampling. The results are applied to estimation of survival probabilities for individual subjects using the regression coefficients (log hazard ratios) and baseline cumulative hazard function of the Cox proportional hazards model. Expressions derived for the variances of regression coefficients and cumulative hazards estimated after calibration of the weights aid the construction of the auxiliary variables used for the adjustment.
We demonstrate empirically the improvement offered by calibration or estimation of the weights via simulation of two-phase stratified samples using publicly available data from the National Wilms Tumor Study and data analysis with the \(R\) survey package.
For the entire collection see [Zbl 1319.62002].
Cited in 3 Documents
MSC:
60F05 Central limit and other weak theorems
60F17 Functional limit theorems; invariance principles
62F12 Asymptotic properties of parametric estimators
60J65 Brownian motion
60J70 Applications of Brownian motions and diffusion theory (population genetics, absorption problems, etc.)
65C60 Computational problems in statistics (MSC2010)
Keywords: semiparametric models; estimation; asymptotic distributions; asymptotic efficiency; Cox regression; calibration; empirical processes; survival analysis; stratified sampling
Software: Survey