Optimal sequential designs of case-control studies. (English) Zbl 1105.62364

Summary: Fixed case-control studies collect a case sample and a control sample separately, with the two sample sizes fixed before the study and sometimes chosen arbitrarily. This often costs case-control designs efficiency, in terms of the money or time the study requires. We study sequential case-control designs and, drawing on treatment allocation and stochastic approximation, derive a simple sampling rule that leads to optimal case-control designs. Some important problems that fixed case-control designs cannot address, such as fixed-width confidence intervals and sequential hypothesis tests with possible early stopping to save time or cost, are shown to be solved naturally by the derived optimal sequential case-control designs.


62L05 Sequential statistical design
62L20 Stochastic approximation
62L15 Optimal stopping in statistics
62L10 Sequential statistical analysis

