
Bayesian hierarchical rule modeling for predicting medical conditions. (English) Zbl 1243.62036

Summary: We propose a statistical modeling technique, called the Hierarchical Association Rule Model (HARM), that predicts a patient’s possible future medical conditions given the patient’s current and past history of reported conditions. The core of our technique is a Bayesian hierarchical model for selecting predictive association rules (such as condition 1 and condition 2 \(\rightarrow\) condition 3) from a large set of candidate rules. Because this method “borrows strength” using the conditions of many similar patients, it is able to provide predictions specialized to any given patient, even when little information about the patient’s history of conditions is available.
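The idea of “borrowing strength” can be illustrated with a minimal sketch. This is not the HARM model itself: it is a simplified beta-binomial shrinkage estimate, and the patient identifiers, the counts, and the names `pooled_rate`, `shrunken_confidence`, and `prior_strength` are all assumptions made for illustration. For one rule, each patient has a count `n` of times the rule's left-hand side was observed and a count `y` of times the right-hand side followed; the per-patient estimate is pulled toward the rate pooled over all patients, most strongly when `n` is small.

```python
def pooled_rate(counts):
    """Rule confidence pooled over all patients: total y / total n."""
    total_n = sum(n for n, _ in counts.values())
    total_y = sum(y for _, y in counts.values())
    return total_y / total_n if total_n else 0.0


def shrunken_confidence(n, y, prior_mean, prior_strength):
    """Posterior mean of a Beta prior updated with y successes in n trials.

    prior_strength is a hypothetical tuning value: it acts like a number
    of pseudo-observations carrying the pooled rate.
    """
    alpha = prior_mean * prior_strength
    beta = (1.0 - prior_mean) * prior_strength
    return (y + alpha) / (n + alpha + beta)


# Hypothetical per-patient counts (n, y) for a single candidate rule.
counts = {
    "patient_a": (20, 10),  # long history
    "patient_b": (10, 2),   # moderate history
    "patient_c": (0, 0),    # no history for this rule
}

prior_mean = pooled_rate(counts)
estimates = {
    pid: shrunken_confidence(n, y, prior_mean, prior_strength=5.0)
    for pid, (n, y) in counts.items()
}

for pid, est in estimates.items():
    print(pid, round(est, 3))
```

In this sketch a patient with no history for the rule receives the pooled rate, while a patient with a long history stays close to their own observed proportion. A full hierarchical model would also estimate the amount of shrinkage from the data rather than fixing it by hand.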

MSC:

62F15 Bayesian inference
62P10 Applications of statistics to biology and medical sciences; meta analysis
92C50 Medical applications (general)
68T05 Learning and adaptive systems in artificial intelligence
