A simulation-based framework for assessing the feasibility of respondent-driven sampling for estimating characteristics in populations of lesbian, gay and bisexual older adults. (English) Zbl 1412.62210

Summary: Respondent-driven sampling (RDS) is a method for sampling from a target population by leveraging social connections. RDS is invaluable to the study of hard-to-reach populations. However, RDS is costly and can be infeasible. RDS is infeasible when RDS point estimators have small effective sample sizes (large design effects), or when RDS interval estimators have poor coverage or yield confidence intervals that are wide relative to estimates obtained in previous studies. As a result, researchers need tools to assess in advance whether estimation of particular characteristics of interest is feasible for specific populations. In this paper, we develop a simulation-based framework for using pilot data (a convenience sample of aggregated, egocentric data, together with estimates of subpopulation sizes within the target population) to assess whether RDS is feasible for estimating characteristics of a target population. In doing so, we assume that more is known about egos than about alters in the pilot data, which is often the case with aggregated, egocentric data in practice. We build on existing methods for estimating the structure of social networks from aggregated, egocentric sample data and estimates of subpopulation sizes within the target population. We apply this framework to assess the feasibility of estimating the proportion male, proportion bisexual, proportion depressed and proportion infected with HIV/AIDS within three spatially distinct target populations of older lesbian, gay and bisexual adults, using pilot data from the Caring and Aging with Pride Study and the Gallup Daily Tracking Survey. We conclude that an RDS sample of 300 subjects is infeasible for estimating the proportion male, but feasible for estimating the proportion bisexual, proportion depressed and proportion infected with HIV/AIDS in all three target populations.
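The feasibility criteria named in the summary (design effect, effective sample size, interval width and coverage) can be illustrated with a minimal Python sketch. This is not the authors' framework, which works from simulated social networks. It only assumes the standard definitions: the design effect is the variance of the estimator across simulated samples divided by the variance of a simple-random-sample proportion of the same size, and the effective sample size is the nominal size divided by the design effect. The function name `feasibility_summary`, the toy inputs and the assumed design effect of 4 are all hypothetical.

```python
import random
import statistics


def feasibility_summary(estimates, intervals, true_value, n):
    """Summarise simulated estimates of a proportion (hypothetical helper).

    estimates  : one point estimate per simulated sample
    intervals  : one (lower, upper) interval per simulated sample
    true_value : the proportion in the simulated population
    n          : nominal sample size of each simulated sample
    """
    # Variance of the estimator across the simulated samples.
    var_est = statistics.variance(estimates)
    # Variance of a simple-random-sample proportion of the same size.
    var_srs = true_value * (1.0 - true_value) / n
    design_effect = var_est / var_srs
    widths = [hi - lo for lo, hi in intervals]
    covered = [lo <= true_value <= hi for lo, hi in intervals]
    return {
        "design_effect": design_effect,
        "effective_n": n / design_effect,
        "coverage": sum(covered) / len(covered),
        "mean_width": statistics.fmean(widths),
    }


# Toy stand-in for simulated RDS output: estimates are drawn around a true
# proportion with variance inflated by an ASSUMED design effect of 4.
rng = random.Random(0)
p, n, assumed_de = 0.3, 300, 4.0
sd = (assumed_de * p * (1.0 - p) / n) ** 0.5
estimates = [rng.gauss(p, sd) for _ in range(2000)]
intervals = [(e - 1.96 * sd, e + 1.96 * sd) for e in estimates]
print(feasibility_summary(estimates, intervals, p, n))
```

In an actual feasibility study the `estimates` and `intervals` would come from RDS estimators applied to samples drawn on simulated networks, and the resulting summaries would be compared with thresholds chosen by the researcher.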


62P25 Applications of statistics to social sciences
62D05 Sampling theory, sample surveys
91D30 Social networks; opinion dynamics


Software: R; igraph; statnet; ergm

