On cognitive preferences and the plausibility of rule-based models. (English) Zbl 07224986

Summary: It is conventional wisdom in machine learning and data mining that logical models such as rule sets are more interpretable than other models, and that among such rule-based models, simpler models are more interpretable than more complex ones. In this position paper, we question this latter assumption by focusing on one particular aspect of interpretability, namely the plausibility of models. Roughly speaking, we equate the plausibility of a model with the likelihood that a user accepts it as an explanation for a prediction. In particular, we argue that, all other things being equal, longer explanations may be more convincing than shorter ones, and that the predominant bias for shorter models, which is typically necessary for learning powerful discriminative models, may not be suitable when it comes to user acceptance of the learned models. To that end, we first recapitulate evidence for and against this postulate, and then report the results of an evaluation in a crowdsourcing study based on about 3000 judgments. The results do not reveal a strong preference for simple rules, whereas we can observe a weak preference for longer rules in some domains. We then relate these results to well-known cognitive biases such as the conjunction fallacy, the representativeness heuristic, or the recognition heuristic, and investigate their relation to rule length and plausibility.

MSC:

68T05 Learning and adaptive systems in artificial intelligence