Exit polling and racial bloc voting: combining individual-level and R\(\times \)C ecological data. (English) Zbl 1220.62159

Summary: Despite its shortcomings, cross-level or ecological inference remains a necessary part of some areas of quantitative inference, including in United States voting rights litigation. Ecological inference suffers from a lack of identification that, most agree, is best addressed by incorporating individual-level data into the model. We test the limits of such an incorporation by attempting it in the context of drawing inferences about racial voting patterns using a combination of an exit poll and precinct-level ecological data; accurate information about racial voting patterns is needed to assess triggers in voting rights laws that can determine the composition of United States legislative bodies. Specifically, we extend and study a hybrid model that addresses two-way tables of arbitrary dimension. We apply the hybrid model to an exit poll we administered in the City of Boston in 2008. Using the resulting data as well as simulation, we compare the performance of a pure ecological estimator, pure survey estimators using various sampling schemes and our hybrid. We conclude that the hybrid estimator offers substantial benefits by enabling substantive inferences about voting patterns not practicably available without its use.


62P25 Applications of statistics to social sciences
62F15 Bayesian inference
62D05 Sampling theory, sample surveys
62H17 Contingency tables
65C60 Computational problems in statistics (MSC2010)

