Rank tests in unmatched clustered randomized trials applied to a study of teacher training. (English) Zbl 1411.62358

Summary: In the Teacher and Leader Performance Evaluation Systems study, schools were randomly assigned to receive new measures of teacher and principal performance. One outcome in the study, measured at the teacher level, was truncated at zero and displayed a long tail. Rank-based statistics are a natural choice for such outcomes: inferences are robust and exact, and no assumptions are needed about the model that generated the data. We investigate four rank statistics that differ in the weighting applied to clusters. Each test statistic has the correct level, but the statistics may differ in their power to detect departures from the null. We conduct power simulations comparing the rank tests with linear mixed models under Normal, \(t\), and Cauchy errors. We obtain a point estimate and construct confidence intervals by applying the Tobit model of effects, which assumes that treatment increases the outcome by a constant amount, but only if the response under control would be positive. We also develop a formal randomization-based method for testing the appropriateness of the Tobit model of effects. In the data from the study, we find no evidence against the Tobit model of effects.
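The approach described in the summary can be illustrated with a small sketch of a cluster-level randomization test built on ranks. This is not the authors' implementation. The function names (`midranks`, `randomization_pvalue`), the two cluster weightings (`"sum"` and `"mean"`), and the adjusted-response form \(\max(y - \tau_0 z, 0)\) used for the Tobit hypothesis are all assumptions made for illustration, and the sketch approximates the exact randomization distribution by Monte Carlo draws of the cluster assignment.

```python
import numpy as np

def midranks(x):
    """1-based ranks; tied values share the average of their positions."""
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    avg = np.cumsum(counts) - (counts - 1) / 2.0
    return avg[inverse]

def randomization_pvalue(y, cluster, z_cluster, tau0=0.0, weight="sum",
                         draws=5000, seed=0):
    """One-sided Monte Carlo p-value for H0: Tobit effect equals tau0.

    y         : unit-level outcomes (nonnegative, possibly many zeros)
    cluster   : unit-level cluster labels
    z_cluster : dict mapping cluster label -> 1 (treated) or 0 (control)
    weight    : "sum" (rank sum per cluster) or "mean" (mean rank per cluster);
                both are illustrative weightings, not the paper's four.
    """
    y = np.asarray(y, dtype=float)
    cluster = np.asarray(cluster)
    ids = np.unique(cluster)
    z = np.array([z_cluster[c] for c in ids])
    z_unit = np.array([z_cluster[c] for c in cluster])

    # Assumed Tobit adjustment: remove tau0 from treated units, floor at zero.
    adjusted = np.maximum(y - tau0 * z_unit, 0.0)

    # Under H0 the adjusted outcomes are fixed, so cluster scores are fixed
    # and only the assignment of clusters to treatment is random.
    ranks = midranks(adjusted)
    if weight == "sum":
        scores = np.array([ranks[cluster == c].sum() for c in ids])
    else:
        scores = np.array([ranks[cluster == c].mean() for c in ids])

    observed = scores[z == 1].sum()
    m = int(z.sum())
    rng = np.random.default_rng(seed)
    null = np.array([scores[rng.permutation(len(ids))[:m]].sum()
                     for _ in range(draws)])
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + draws)

# Toy data: 8 clusters of 5 units, clusters 0-3 treated.
# Control-type response is 0..4; treated units gain 10 only if it is positive.
cluster = np.repeat(np.arange(8), 5)
base = np.tile(np.arange(5.0), 8)
z_cluster = {c: int(c < 4) for c in range(8)}
treated_unit = np.array([z_cluster[c] for c in cluster])
y = np.where((treated_unit == 1) & (base > 0), base + 10.0, base)

p_null = randomization_pvalue(y, cluster, z_cluster, tau0=0.0)
p_true = randomization_pvalue(y, cluster, z_cluster, tau0=10.0)
```

In this toy example the hypothesis of no effect should be rejected while the hypothesis \(\tau_0 = 10\) should not. Inverting the test over a grid of \(\tau_0\) values would give a confidence set in the usual way, again under the assumed form of the adjustment.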


62P25 Applications of statistics to social sciences
62H30 Classification and discrimination; cluster analysis (statistical aspects)



