Testing for association in contingency tables with multiple column responses. (English) Zbl 1058.62534
Summary: In many studies, multiple categorical responses or measurements are made on members of different populations or treatment groups. This arises often in surveys where individuals may mark all answers that apply when responding to a multiple-choice question. Frequently, it is of interest to determine whether the distributions of responses differ among groups. In this situation, the test statistic of the usual Pearson chi-square test no longer measures a scaled distance between observed and hypothesized cell counts in a contingency table, and its distribution is no longer the familiar chi-square.
This paper presents a modification to the Pearson statistic that measures the appropriate distance for multiple-response tables. Its asymptotic distribution is shown to be that of a linear combination of chi-square random variables with coefficients depending on the true probabilities. Because those probabilities are unknown in practice, a bootstrap resampling method is proposed instead to obtain the sampling distribution under the null hypothesis. Simulations show that this bootstrap method maintains its size under a variety of circumstances, while a naively applied Pearson chi-square test is severely affected by multiple responses.
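The bootstrap idea in the summary can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that the modified statistic is the sum, over response items, of the ordinary Pearson statistics for each groups-by-(marked, not marked) table, and that the null distribution is approximated by resampling whole response vectors from the pooled sample while keeping group sizes fixed. The function names `modified_pearson` and `bootstrap_pvalue` and the parameters `n_boot` and `seed` are hypothetical.

```python
import numpy as np


def modified_pearson(groups, responses):
    """Sum of per-item Pearson statistics (assumed form of the modified statistic).

    groups:    length-n array of group labels
    responses: n x c array of 0/1 indicators (1 = item marked)
    """
    groups = np.asarray(groups)
    responses = np.asarray(responses, dtype=float)
    labels = np.unique(groups)
    # Group sizes and per-group counts of "marked" for every item.
    n = np.array([(groups == g).sum() for g in labels], dtype=float)
    marked = np.vstack([responses[groups == g].sum(axis=0) for g in labels])
    pooled = responses.mean(axis=0)  # pooled marking rate per item under H0

    stat = 0.0
    for j in range(responses.shape[1]):
        p = pooled[j]
        if p == 0.0 or p == 1.0:
            continue  # item marked by nobody or everybody: contributes nothing
        obs_yes = marked[:, j]
        obs_no = n - obs_yes
        exp_yes = n * p
        exp_no = n * (1.0 - p)
        stat += ((obs_yes - exp_yes) ** 2 / exp_yes).sum()
        stat += ((obs_no - exp_no) ** 2 / exp_no).sum()
    return stat


def bootstrap_pvalue(groups, responses, n_boot=999, seed=0):
    """Bootstrap p-value under H0 (assumed resampling scheme, see lead-in)."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    responses = np.asarray(responses)
    observed = modified_pearson(groups, responses)
    n_total = len(groups)
    exceed = 0
    for _ in range(n_boot):
        # Draw whole response vectors from the pooled sample, so dependence
        # between items within a respondent is preserved, and attach them to
        # the original group labels, which imposes equal distributions.
        idx = rng.integers(0, n_total, size=n_total)
        if modified_pearson(groups, responses[idx]) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_boot + 1)
```

A call such as `bootstrap_pvalue(groups, responses)` returns the observed statistic and an approximate p-value. Resampling whole rows rather than individual cells is the design choice that matters here: the items marked by one respondent are correlated, which is the reason the chi-square reference distribution fails in the first place.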

62H17 Contingency tables
62G10 Nonparametric hypothesis testing
62E20 Asymptotic distribution theory in statistics
62G09 Nonparametric statistical resampling methods
AS 183