
Improving precision of ability estimation: getting more from response times. (English) Zbl 1460.62175

Summary: By considering information about response time (RT) in addition to response accuracy (RA), joint models for RA and RT such as the hierarchical model [W. J. van der Linden, Psychometrika 72, No. 3, 287–308 (2007; Zbl 1286.62112)] can improve the precision with which ability is estimated over models that only consider RA. The hierarchical model, however, assumes that only the person’s speed is informative of ability. This assumption of conditional independence between RT and ability given speed may be violated in practice, and ignores collateral information about ability that may be present in the residual RTs. We propose a posterior predictive check for evaluating the assumption of conditional independence between RT and ability given speed. Furthermore, we propose an extension of the hierarchical model that contains cross-loadings between ability and RT, which enables one to take additional collateral information about ability into account beyond what is possible in the standard hierarchical model. A Bayesian estimation procedure is proposed for the model. Using simulation studies, the performance of the model is evaluated in terms of parameter recovery, and the possible gain in precision over the standard hierarchical model and an RA only model is considered. The model is applied to data from a high-stakes educational test.
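The conditional-independence assumption described in the summary can be illustrated with a small simulation. The following is a minimal sketch, not the authors' estimation procedure or their posterior predictive check: it assumes a normal-ogive model for RA, a lognormal model for RT, and a cross-loading entering the log-RT mean as `+ lam * theta`. The sign, the parametrization, the parameter ranges, and the function names `simulate` and `residual_ability_correlation` are illustrative assumptions. True person and item parameters are used in place of posterior draws.

```python
import numpy as np


def simulate(n_persons=2000, n_items=20, rho=0.5, cross_loading=0.0, seed=0):
    """Simulate RA and log-RT data (assumed parametrization, for illustration)."""
    rng = np.random.default_rng(seed)
    # Ability (theta) and speed (tau) are correlated at the person level.
    cov = np.array([[1.0, rho], [rho, 1.0]])
    person = rng.multivariate_normal(np.zeros(2), cov, size=n_persons)
    theta, tau = person[:, 0], person[:, 1]

    a = rng.uniform(0.8, 1.5, n_items)      # item discriminations
    b = rng.normal(0.0, 1.0, n_items)       # item difficulties
    xi = rng.normal(4.0, 0.3, n_items)      # item time intensities
    sigma = rng.uniform(0.3, 0.5, n_items)  # residual SD of log-RT
    lam = np.full(n_items, cross_loading)   # assumed cross-loadings of ability on RT

    # Normal-ogive RA via the latent-variable formulation: X = 1 if Z > 0.
    z = a * theta[:, None] - b + rng.standard_normal((n_persons, n_items))
    x = (z > 0).astype(int)

    # Lognormal RT; lam = 0 gives conditional independence of RT and ability given speed.
    noise = sigma * rng.standard_normal((n_persons, n_items))
    log_t = xi - tau[:, None] + lam * theta[:, None] + noise
    return theta, tau, x, log_t, xi


def residual_ability_correlation(theta, tau, log_t, xi):
    """Per-item correlation between ability and the residual log-RT given speed."""
    resid = log_t - (xi - tau[:, None])
    return np.array([np.corrcoef(theta, resid[:, i])[0, 1]
                     for i in range(resid.shape[1])])


if __name__ == "__main__":
    for lam in (0.0, 0.3):
        theta, tau, x, log_t, xi = simulate(cross_loading=lam, seed=1)
        r = residual_ability_correlation(theta, tau, log_t, xi)
        print(f"cross-loading {lam}: mean residual correlation {r.mean():.3f}")
```

With a zero cross-loading the residual log-RTs carry no information about ability beyond speed, so the per-item correlations scatter around zero. A non-zero cross-loading leaves a systematic correlation, which is the kind of collateral information the extended model is meant to exploit.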

MSC:

62P15 Applications of statistics to psychology
62H12 Estimation in multivariate analysis

Citations:

Zbl 1286.62112

Software:

cirt

References:

[1] Adams, R. (2005). Reliability as a measurement design effect. Studies in Educational Evaluation, 31, 162-172. https://doi.org/10.1016/j.stueduc.2005.05.008
[2] Albert, J. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251-269. https://doi.org/10.2307/1165149
[3] Bolsinova, M., de Boeck, P., & Tijmstra, J. (2017). Modelling conditional dependence between response time and accuracy. Psychometrika, 82, 1126-1148. https://doi.org/10.1007/s11336-016-9537-6 · Zbl 1402.62300
[4] Bolsinova, M., & Maris, G. (2016). A test for conditional independence between response time and accuracy. British Journal of Mathematical and Statistical Psychology, 69, 62-79. https://doi.org/10.1111/bmsp.12059 · Zbl 1406.91341
[5] Bolsinova, M., & Tijmstra, J. (2016). Posterior predictive checks for conditional independence between response time and accuracy. Journal of Educational and Behavioral Statistics, 41, 123-145. https://doi.org/10.3102/1076998616631746
[6] Bolsinova, M., Tijmstra, J., & Molenaar, D. (2016). Response moderation models for conditional dependence between response time and response accuracy. British Journal of Mathematical and Statistical Psychology, 70, 257-279. https://doi.org/10.1111/bmsp.12076 · Zbl 1406.91342
[7] Casella, G., & George, E. (1992). Explaining the Gibbs sampler. American Statistician, 46, 167-174. https://doi.org/10.2307/2685208
[8] Fox, J.‐P., Klein Entink, R., & van der Linden, W. (2007). Modeling of responses and response times with the package cirt. Journal of Statistical Software, 20(7), 1-14. https://doi.org/10.18637/jss.v020.i07
[9] Gelman, A., Meng, X., & Stern, H. (1996). Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6, 733-807. · Zbl 0859.62028
[10] Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741. https://doi.org/10.1109/TPAMI.1984.4767596 · Zbl 0573.62030
[11] Geweke, J. (1992). Evaluating the accuracy of sampling‐based approaches to the calculation of posterior moments. In J. M. Bernardo, J. O. Berger, A. P. Dawid, & A. F. M. Smith (Eds.), Bayesian statistics 4 (pp. 169-193). Oxford, UK: Oxford University Press.
[12] Gitomer, D. H., Curtis, M. E., Glaser, R., & Lensky, D. B. (1987). Processing differences as a function of item difficulty in verbal analogy performance. Journal of Educational Psychology, 79, 212-219. https://doi.org/10.1037/0022-0663.79.3.212
[13] Hoff, P. D. (2009). A first course in Bayesian statistical methods. New York, NY: Springer. https://doi.org/10.1007/978-0-387-92407-6 · Zbl 1213.62044
[14] Klein Entink, R., Fox, J., & van der Linden, W. J. (2009). A multivariate multilevel approach to the modeling of accuracy and speed of test takers. Psychometrika, 74, 21-48. https://doi.org/10.1007/s11336-008-9075-y · Zbl 1263.62138
[15] Lord, F., & Novick, M. (1968). Statistical theories of mental test scores. Reading, MA: Addison‐Wesley. · Zbl 0186.53701
[16] Marianti, S., Fox, J.‐P., Avetisyan, M., Veldkamp, B. P., & Tijmstra, J. (2014). Testing for aberrant behavior in response time modeling. Journal of Educational and Behavioral Statistics, 39, 426-451. https://doi.org/10.3102/1076998614559412
[17] Meng, X.‐L. (1994). Posterior predictive p‐values. Annals of Statistics, 22(3), 1142-1160. https://doi.org/10.1214/aos/1176325622 · Zbl 0820.62027
[18] Meng, X. B., Tao, J., & Chang, H. H. (2015). A conditional joint modeling approach for locally dependent item responses and response times. Journal of Educational Measurement, 52, 1-27. https://doi.org/10.1111/jedm.12060
[19] Millsap, R. (2011). Statistical approaches to measurement invariance. New York, NY: Routledge.
[20] Molenaar, D., Tuerlinckx, F., & van der Maas, H. (2015). A bivariate generalized linear item response theory modeling framework to the analysis of responses and response times. Multivariate Behavioral Research, 50, 56-74. https://doi.org/10.1080/00273171.2014.962684 · Zbl 1406.91372
[21] Ranger, J. (2013). A note on the hierarchical model for responses and response times in tests of van der Linden (2007). Psychometrika, 78, 538-544. https://doi.org/10.1007/s11336-013-9324-6 · Zbl 1284.62739
[22] Ranger, J., & Ortner, T. (2012). The case of dependency of responses and response times: Modeling approach based on standard latent trait models. Psychological Test and Assessment Modeling, 54, 128-148.
[23] Rubin, D. (1984). Bayesianly justifiable and relevant frequency calculations for the applied statistician. Annals of Statistics, 12, 1151-1172. https://doi.org/10.1214/aos/1176346785 · Zbl 0555.62010
[24] Schaeffer, G. A., Reese, C. M., Steffen, M., McKinley, R. L., & Mills, C. N. (1993). Field test of a computer‐based GRE general test (ETS Research report No. 93‐07). Princeton, NJ: Educational Testing Service.
[25] Sinharay, S. (2005). Assessing fit of unidimensional item response theory models using a Bayesian approach. Journal of Educational Measurement, 42, 375-394. https://doi.org/10.1111/j.1745-3984.2005.00021.x
[26] Sinharay, S., Johnson, M. S., & Stern, H. S. (2006). Posterior predictive assessment of item response theory models. Applied Psychological Measurement, 30, 298-321. https://doi.org/10.1177/0146621605285517
[27] Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & van der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B, 64, 583-639. https://doi.org/10.1111/1467-9868.00353 · Zbl 1067.62010
[28] Tanner, M., & Wong, W. (1987). The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association, 82, 528-540. https://doi.org/10.2307/2289457 · Zbl 0619.62029
[29] van der Linden, W. J. (2006). A lognormal model for response times on test items. Journal of Educational and Behavioral Statistics, 31, 181-204. https://doi.org/10.3102/10769986031002181
[30] van der Linden, W. J. (2007). A hierarchical framework for modeling speed and accuracy on test items. Psychometrika, 72, 287-308. https://doi.org/10.1007/s11336-006-1478-z · Zbl 1286.62112
[31] van der Linden, W. J. (2008). Using response times for item selection in adaptive testing. Journal of Educational and Behavioral Statistics, 33(1), 5-20. https://doi.org/10.3102/1076998607302626
[32] van der Linden, W. J., & Glas, C. A. W. (2010). Statistical tests of conditional independence between responses and/or response times on test items. Psychometrika, 75(1), 120-139. https://doi.org/10.1007/s11336-009-9129-9 · Zbl 1272.62140
[33] van der Linden, W. J., & Guo, F. (2008). Bayesian procedures for identifying aberrant response‐time patterns in adaptive testing. Psychometrika, 73, 365-384. https://doi.org/10.1007/s11336-007-9046-8 · Zbl 1301.62126
[34] van der Linden, W. J., Klein Entink, R. H., & Fox, J. P. (2010). IRT parameter estimation with response times as collateral information. Applied Psychological Measurement, 34, 327-347. https://doi.org/10.1177/0146621609349800
[35] van der Linden, W. J., Scrams, D., & Schnipke, D. (1999