Analyzing bivariate continuous data grouped into categories defined by empirical quantiles of marginal distributions.

*(English)* Zbl 0896.62114

Summary: Epidemiologists sometimes study the association between two measurements of exposure on the same subjects by grouping the original bivariate continuous data into categories that are defined by the empirical quantiles of the two marginal distributions. Although such grouped data are presented in a two-way contingency table, the cell counts in this table do not have a multinomial distribution. We describe the joint distribution of counts in such a table by the term empirical bivariate quantile-partitioned (EBQP) distribution. N. Blomqvist [Ann. Math. Statistics 21, 593-600 (1950; Zbl 0040.22403)] gave an asymptotic EBQP theory for bivariate data partitioned by the sample medians. We demonstrate that his asymptotic theory is not correct, however, except in special cases. We present a general asymptotic theory for tables of arbitrary dimensions and apply this theory to construct confidence intervals for the kappa statistic. We show by simulations that the confidence interval procedures we propose have near nominal coverage for sample sizes exceeding 60 for both \(2\times 2\) and \(3\times 3\) tables. These simulations also illustrate that the asymptotic theory of Blomqvist and the methods of J. L. Fleiss, J. Cohen and B. S. Everitt [Psychol. Bull. 72, 323-327 (1969)] for multinomial tables can yield subnominal coverage for kappa calculated from EBQP tables, although in some cases the coverage for these procedures is near nominal levels.
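The following is a minimal sketch, not the authors' procedure, of why an EBQP table is not multinomial: partitioning each margin at its own empirical quantiles fixes the row and column totals in advance. The bivariate normal data, the correlation `rho`, the helper names `ebqp_table` and `cohen_kappa`, and the use of unweighted Cohen's kappa are all assumptions made for illustration.

```python
import numpy as np

def ebqp_table(x, y, k):
    """Cross-classify paired data into a k x k table, cutting each
    margin at its own empirical quantiles (rank-based, assumes no ties)."""
    n = len(x)
    # 0-based rank of each observation within its own margin
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    # rank r is assigned to quantile category floor(r * k / n)
    cat_x = (rank_x * k) // n
    cat_y = (rank_y * k) // n
    table = np.zeros((k, k), dtype=int)
    np.add.at(table, (cat_x, cat_y), 1)
    return table

def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square table of counts."""
    n = table.sum()
    p_obs = np.trace(table) / n
    p_exp = (table.sum(axis=1) @ table.sum(axis=0)) / n**2
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical setting: 90 subjects, tertiles, correlated normal exposures
rng = np.random.default_rng(0)
n, k, rho = 90, 3, 0.6
xy = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)

table = ebqp_table(xy[:, 0], xy[:, 1], k)
kappa = cohen_kappa(table)

print(table)
print("row totals:", table.sum(axis=1))     # each equals n / k by construction
print("column totals:", table.sum(axis=0))  # each equals n / k by construction
print("kappa:", kappa)
```

Because every margin total equals `n / k` whenever `k` divides `n`, the chance-agreement term in kappa is the constant `1 / k` rather than a random quantity, which is one way to see that variance formulas derived for multinomial tables need not apply here.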

##### MSC:

| Code | Classification |
|---|---|
| 62P10 | Applications of statistics to biology and medical sciences; meta analysis |
| 62E20 | Asymptotic distribution theory in statistics |
| 62H17 | Contingency tables |
| 62G15 | Nonparametric tolerance and confidence regions |