zbMATH — the first resource for mathematics

White-box induction from SVM models: explainable AI with logic programming. (English) Zbl 07284962
Summary: We focus on the problem of inducing logic programs that explain models learned by the support vector machine (SVM) algorithm. The top-down sequential covering inductive logic programming (ILP) algorithms (e.g., FOIL) apply hill-climbing search using heuristics from information theory. A major issue with this class of algorithms is getting stuck in local optima. In our new approach, however, the data-dependent hill-climbing search is replaced with a model-dependent search where a globally optimal SVM model is trained first, then the algorithm looks into support vectors as the most influential data points in the model, and induces a clause that would cover the support vector and points that are most similar to that support vector. Instead of defining a fixed hypothesis search space, our algorithm makes use of SHAP, an example-specific interpreter in explainable AI, to determine a relevant set of features. This approach yields an algorithm that captures the SVM model’s underlying logic and outperforms other ILP algorithms in terms of the number of induced clauses and classification evaluation metrics.
68N17 Logic programming
Full Text: DOI
[1] Aggarwal, Charu C. and Jiawei (editors) Han. 2014. Frequent Pattern Mining. Springer International Publishing. · Zbl 1297.68010
[2] Rakesh, Agrawal and Srikant, Ramakrishnan. 1994. Fast Algorithms for Mining Association Rules in Large Databases. In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB ’94). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 487-499.
[3] Baral, Chitta. 2003. Knowledge representation, reasoning and declarative problem solving. Cambridge University Press. · Zbl 1056.68139
[4] Tianqi, Chen and Guestrin, Carlos. 2016. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD (San Francisco, California, USA) (KDD ’16). 785-794.
[5] Cohen, William W.. 1995. Fast Effective Rule Induction. In Proceedings of the Twelfth International Conference on International Conference on Machine Learning (Tahoe City, California, USA) (ICML’95). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 115-123.
[6] Corinna, Cortes and Vapnik, Vladimir. 1995. Support-Vector Networks. Mach. Learn. 20, 3 (Sept. 1995), 273-297. https://doi.org/10.1023/A:1022627411411 · Zbl 0831.68098
[7] Craven, Mark W. and Shavlik, Jude W.. 1995. Extracting Tree-structured Representations of Trained Networks. In Proceedings of the 8th International Conference on Neural Information Processing Systems (Denver, Colorado) (NIPS’95). MIT Press, Cambridge, MA, USA, 24-30.
[8] Diederich, Joachim. 2008. Rule Extraction from Support Vector Machines: An Introduction. Springer Berlin Heidelberg, Berlin, Heidelberg, 3-31. · Zbl 1148.68431
[9] Dheeru, Dua and Graff, Casey. 2017. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml
[10] Philippe Fournier-Viger, Jerry Chun-Wei Lin, Tin Truong-Chi, and Roger Nkambou. 2019. A Survey of High Utility Itemset Mining. In High-Utility Pattern Mining: Theory, Algorithms and Applications. Springer International Publishing, 1-45. https://doi.org/10.1007/978-3-030-04921-8_1
[11] Philippe Fournier-Viger, Cheng-Wei Wu, Souleymane Zida, and Vincent S. Tseng. 2014. FHM: Faster High-Utility Itemset Mining Using Estimated Utility Co-occurrence Pruning. In Foundations of Intelligent Systems, Troels, Andreasen, Henning, Christiansen, Juan-Carlos, Cubero, and Raś, Zbigniew W. (Eds.). Springer International Publishing, 83-92.
[12] Glenn, Fung, Sathyakama, Sandilya, and Bharat Rao, R.. 2005. Rule Extraction from Linear Support Vector Machines. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (Chicago, Illinois, USA) (KDD ’05). ACM, New York, NY, USA, 32-40. · Zbl 1148.68433
[13] Wensheng, Gan, Jerry Chun-Wei, Lin, Philippe Fournier-Viger, Han-Chieh Chao, Tzung-Pei Hong, and Hamido Fujita. 2018. A survey of incremental high-utility itemset mining. Wiley Interdiscip. Rev. Data Min. Knowl. Discov.8, 2 (2018).
[14] Michael, Gelfond and Kahl, Yulia. 2014. Knowledge representation, reasoning, and the design of intelligent agents: The answer-set programming approach. Cambridge University Press.
[15] Gunning, David. 2015. Explainable Artificial Intelligence (XAI), https://www.darpa.mil/program/explainable-artificial-intelligence. Retrieved June 2018.
[16] Mark, Hall, Eibe, Frank, Geoffrey, Holmes, Bernhard, Pfahringer, Peter, Reutemann, and Witten, Ian H.. 2009. The WEKA data mining software: an update. SIGKDD Explorations 11, 1 (2009), 10-18. · Zbl 1242.68001
[17] Johan, Huysmans, Bart, Baesens, and Vanthienen, Jan. 2006. ITER: an algorithm for predictive regression rule extraction. In International Conference on Data Warehousing and Knowledge Discovery. Springer, Krakow, Poland, 270-279.
[18] Johan, Huysmans, Rudy, Setiono, Bart, Baesens, and Vanthienen, Jan. 2008. Minerva: Sequential covering for rule extraction. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 38, 2 (2008), 299-309. · Zbl 1148.68439
[19] Niels, Landwehr, Kristian, Kersting, and De Raedt, Luc. 2005. nFOIL: Integrating Nave Bayes and FOIL. In Proceedings, The Twentieth National Conference on Artificial Intelligence and the Seventeenth Innovative Applications of Artificial Intelligence Conference, July 9-13, 2005, Pittsburgh, Pennsylvania, USA. 795-800. · Zbl 1222.68242
[20] Niels, Landwehr, Andrea, Passerini, Luc De, Raedt, and Frasconi, Paolo. 2006. kFOIL: Learning Simple Relational Kernels. In Proceedings, The Twenty-First National Conference on Artificial Intelligence and the Eighteenth Innovative Applications of Artificial Intelligence Conference, July 16-20, 2006, Boston, Massachusetts, USA. 389-394.
[21] Lundberg, Scott M, Gabriel G Erion, and Su-In Lee. 2018. Consistent individualized feature attribution for tree ensembles. arXiv preprint arXiv:1802.03888 (2018).
[22] Scott M Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30. Curran Associates, Inc., Long Beach, CA, 4765-4774.
[23] David Martens, Johan Huysmans, Rudy Setiono, Jan Vanthienen, and Bart Baesens. 2008. Rule Extraction from Support Vector Machines: An Overview of Issues and Application in Credit Scoring. In Rule Extraction from Support Vector Machines, Joachim, Diederich (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 33-63. · Zbl 1148.68439
[24] Molnar, Christoph. 2019. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/.
[25] Muggleton, Stephen. 1991. Inductive Logic Programming. New Gen. Comput. 8, 4 (Feb. 1991), 295-318. · Zbl 0712.68022
[26] Stephen Muggleton, Luc de Raedt, David Poole, Ivan Bratko, Peter Flach, Katsumi Inoue, and Ashwin Srinivasan. 2012. ILP Turns 20. Mach. Learn. 86, 1 (Jan. 2012), 3-23.
[27] Stephen Muggleton, Huma Lodhi, Ata Amini, and Michael J. E. Sternberg. 2005. Support Vector Inductive Logic Programming. In Discovery Science, Achim, Hoffmann, Hiroshi, Motoda, and Tobias, Scheffer (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. · Zbl 1088.68823
[28] Haydemar, Nuñez, Cecilio, Angulo, and Català, Andreu. 2002. Rule extraction from SVM. In Proc. The European Symposium on Artificial Neural Networks. Bruges, Belgium, 107-112. · Zbl 1036.68543
[29] Plotkin, G. D.. 1971. A further note on inductive generalization. In Machine Intelligence (1971), Vol. 6. Edinburgh University Press, pages 101-124. · Zbl 0261.68042
[30] Ross Quinlan, J.. 1990. Learning Logical Definitions from Relations. Machine Learning5 (1990), 239-266.
[31] Ross Quinlan, J.. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
[32] Marco Tulio, Ribeiro, Sameer, Singh, and Guestrin, Carlos. 2016. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD 2016. 1135-1144.
[33] Riguzzi, F.. 2016. ALEPH in SWI-Prolog. https://github.com/friguzzi/aleph
[34] Sakama, Chiaki. 2005. Induction from answer sets in nonmonotonic logic programs. ACM Trans. Comput. Log. 6, 2 (2005), 203-231. · Zbl 1367.68039
[35] Farhad, Shakerin and Gupta, Gopal. 2020. Whitebox Induction of Default Rules Using High-Utility Itemset Mining. In Practical Aspects of Declarative Languages - 22nd International Symposium, PADL 2020, New Orleans, LA, USA, January 20-21, 2020, Proceedings (Lecture Notes in Computer Science, Vol. 12007), Ekaterina, Komendantskaya and Yanhong Annie, Liu (Eds.). Springer, 168-176. https://doi.org/10.1007/978-3-030-39197-3_11
[36] Farhad, Shakerin, Elmer, Salazar, and Gupta, Gopal. 2017. A new algorithm to automate inductive learning of default theories. TPLP 17, 5-6 (2017), 1010-1026. · Zbl 1422.68029
[37] Srinivasan, Ashwin. 2001. The Aleph Manual. http://web.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/
[38] Tseng, Vincent S, Cheng-Wei, Wu, Philippe Fournier-Viger, and S Yu Philip. 2016. Efficient algorithms for mining top-k high utility itemsets. IEEE Trans. on Know. and Data Engg. 28, 1 (2016), 54-67.
[39] Wielemaker, Jan. 2020. The SWI-Prolog System. http://www.swi-prolog.org/ · Zbl 1204.68063
[40] Qiang, Zeng, Patel, Jignesh M., and Page, David. 2014. QuickFOIL: Scalable Inductive Logic Programming. Proc. VLDB Endow. 8, 3 (Nov. 2014), 197-208.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.