zbMATH — the first resource for mathematics

Knowledge representation analysis of graph mining. (English) Zbl 07093478
Summary: This paper analyses the graph mining problem, and the frequent pattern mining task associated with it. In general, frequent pattern mining looks for a graph which occurs frequently within a network or, in the transactional setting, within a dataset of graphs. We discuss this task in the transactional setting, which is a problem of interest in many fields such as bioinformatics, chemoinformatics and social networks. We look at the graph mining problem from a Knowledge Representation point of view, hoping to learn something about support for higher-order logics in declarative languages and solvers. Graph mining is studied as a prototypical problem; it is easily expressible mathematically and exists in many variations. As such, it appears to be a prime candidate for a declarative approach; one would expect this allows for a clear, structured, statement of the problem combined with easy adaptation to changing requirements and variations. Current state-of-the-art KR languages such as IDP and ASP aspire to be practical solvers for such problems (Bruynooghe, Theory Practice Logic Program. (TPLP) 15(6), 783-817 2015). Nevertheless, expressing the graph mining problem in these languages requires unexpectedly complicated and unintuitive encoding techniques. These techniques are in contrast to the ease with which one can transform the mathematical definition of graph mining to a higher-order logic specification, and distract from the problem essentials, complicating possible future adaptation. In this paper, we argue that efforts should be made towards supporting higher-order logic specifications in modern specification languages, without unintuitive and complicated encoding techniques. We argue that this not only makes representation clearer and more susceptible to future adaptation, but might also allow for faster, more competitive solver techniques to be implemented.
68T30 Knowledge representation
Full Text: DOI
[1] Abramson, H., Rogers, H.: Meta-Programming in Logic Programming. MIT Press (1989)
[2] Abrial, J.R.: The B-Book. Cambridge University Press. https://doi.org/10.1017/CBO9780511624162 (1996) · Zbl 0915.68015
[3] Abrial, J.R.: Modeling in Event-B: System and Software Engineering. Cambridge University Press (2010) · Zbl 1213.68214
[4] Aoga, J.O.R., Guns, T., Schaus, P.: An efficient algorithm for mining frequent sequence with constraint programming. In: Frasconi, P., Landwehr, N., Manco, G., Vreeken, J. (eds.) Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2016, Riva del Garda, Italy, September 19-23, 2016, Proceedings, Part II, Lecture Notes in Computer Science, vol. 9852, pp 315-330. Springer (2016), https://doi.org/10.1007/978-3-319-46227-1_20
[5] Babai, L.: Graph isomorphism in quasipolynomial time. CoRR 1512.03547 (2015) · Zbl 1376.68058
[6] Bogaerts, B., Janhunen, T., Tasharrofi, S.: Solving QBF instances with nested SAT solvers. In: Darwiche, A. (ed.) Beyond NP, Papers from the 2016 AAAI Workshop, Phoenix, Arizona, USA, February 12, 2016., AAAI Workshops. http://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/view/12603, vol. WS-16-05. AAAI Press (2016) · Zbl 1379.68044
[7] Bowen, J.P.: Formal Specification and Documentation using Z. International Thomson Computer Press (1996) · Zbl 0875.68629
[8] Brewka, G., Delgrande, J.P., Romero, J., Schaub, T.: asprin: Customizing answer set preferences without a headache. In: AAAI, pp. 1467-1474. AAAI Press (2015)
[9] Bruynooghe, M.; Blockeel, H.; Bogaerts, B.; Cat, B.; Pooter, SD; Jansen, J.; Labarre, A.; Ramon, J.; Denecker, M.; Verwer, S., Predicate logic as a modeling language: modeling and solving some machine learning and data mining problems with IDP3, Theory Practice Logic Program. (TPLP), 15, 783-817, (2015) · Zbl 1379.68279
[10] Cat, B.; Denecker, M.; Bruynooghe, M.; Stuckey, PJ, Lazy model expansion: Interleaving grounding with search, J. Artif. Intell. Res., 52, 235-286, (2015) · Zbl 1323.68464
[11] Chen, W.; Kifer, M.; Warren, DS, Hilog: A foundation for higher-order logic programming, J. Logic Program., 15, 187-230, (1993) · Zbl 0787.68017
[12] Cuteri, B.; Dodaro, C.; Ricca, F.; Schu̇ller, P., Constraints, lazy constraints, or propagators in ASP solving: An empirical analysis, Theory Pract. Logic Program. (TPLP), 17, 780-799, (2017) · Zbl 06803822
[13] Dasseville, I.; Hallen, M.; Janssens, G.; Denecker, M., Semantics of templates in a compositional framework for building logics, Theory Pract. Logic Program. (TPLP), 15, 681-695, (2015) · Zbl 1379.68092
[14] De Cat, B., Bogaerts, B., Bruynooghe, M., Janssens, G., Denecker, M.: Predicate logic as a modelling language: The IDP system. CoRR 1401.6312v2 (2016)
[15] De Raedt, L., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: ACM SIGKDD, pp. 204-212 (2008) · Zbl 1353.68233
[16] Eiter, T.; Fink, M.; Ianni, G.; Krennwallner, T.; Redl, C.; Schu̇ller, P., A model building framework for answer set programming with external computations, Theory Pract. Logic Program. (TPLP), 16, 418-464, (2016) · Zbl 1379.68058
[17] Eiter, T., Ianni, G., Krennwallner, T.: Answer set programming: A primer. In: Reasoning Web, Lecture Notes in Computer Science, vol. 5689, pp. 40-110. Springer (2009) · Zbl 1254.68248
[18] Gebser, M., Kaminski, R., Kaufmann, B., Schaub, T.: Answer Set Solving in Practice. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan and Claypool Publishers (2012) · Zbl 1251.68060
[19] Gebser, M., Kaminski, R., Kaufmann, B., Schaub, T.: Clingo = ASP + control: Preliminary report. CoRR 1405.3694 (2014)
[20] Gebser, M., Kaufmann, B., Schaub, T.: Solution enumeration for projected boolean search problems. In: Constraint Programming, Artificial Intelligence and Operations Research (CPAIOR), Lecture Notes in Computer Science, vol. 5547, pp. 71-86. Springer (2009) · Zbl 1241.68100
[21] Guyet, T., Moinard, Y., Quiniou, R., Schaub, T.: Efficiency Analysis of ASP Encodings for Sequential Pattern Mining Tasks, pp 41-81. Springer International Publishing, Cham (2018)
[22] van der Hallen, M., Janssens, G.: A grounder from second-order logic to qbf. In: Quantified Boolean Formulas, Papers from the 2018 FLoC Quantified Boolean Formulas and Beyond Workshop, Oxford, England, July 8, 2018 (accepted), Federated Logic Conference (FLoC): workshop proceedings (2018)
[23] van der Hallen, M., Paramonov, S., Leuschel, M., Janssens, G.: Knowledge representation analysis of graph mining. CoRR 1608.08956 (2016)
[24] Immerman, N.: Descriptive complexity and model checking. In: Arvind, V., Ramanujam, R. (eds.) Foundations of Software Technology and Theoretical Computer Science, 18th Conference, Chennai, India, December 17-19, 1998, Proceedings, Lecture Notes in Computer Science, vol. 1530, pp 1-5. Springer (1998), https://doi.org/10.1007/978-3-540-49382-2_1
[25] Järvisalo, M.: Itemset mining as a challenge application for answer set enumeration. Logic Programming and Nonmonotonic Reasoning (LPNMR), 304-310 (2011)
[26] Kaufmann, B.; Leone, N.; Perri, S.; Schaub, T., Grounding and solving in answer set programming, AI Mag., 37, 25-32, (2016)
[27] Kemmar, A.; Lebbah, Y.; Loudni, S.; Boizumault, P.; Charnois, T., Prefix-projection global constraint and top-k approach for sequential pattern mining, Constraints, 22, 265-306, (2017) · Zbl 1427.68066
[28] Lamport, L.: Specifying Systems, The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley (2002)
[29] Leuschel, M.; Butler, MJ, ProB: An automated analysis toolset for the B method, STTT, 10, 185-203, (2008)
[30] Li, H.; Yap, CW; Ung, CY; Xue, Y.; Cao, ZW; Chen, YZ, Effect of selection of molecular descriptors on the prediction of bloodbrain barrier penetrating and nonpenetrating agents by statistical learning methods, J. Chem. Inf. Model., 45, 1376-1384, (2005)
[31] Lonsing, F.; Biere, A., Depqbf: A dependency-aware QBF solver, JSAT, 7, 71-76, (2010)
[32] Lonsing, F., Egly, U., Gelder, A.V.: Efficient clause learning for quantified boolean formulas via QBF pseudo unit propagation. In: Järvisalo, M., Gelder, A.V. (eds.) Theory and Applications of Satisfiability Testing - SAT 2013 - 16th International Conference, Helsinki, Finland, July 8-12, 2013. Proceedings, Lecture Notes in Computer Science, vol. 7962, pp 100-115. Springer (2013), https://doi.org/10.1007/978-3-642-39071-5_9 · Zbl 1390.68579
[33] McCarthy, J.: Elaboration tolerance. In: Working Papers of the Fourth International Symposium on Logical formalizations of Commonsense Reasoning, Commonsense-1998 (1998)
[34] Muggleton, S.; Raedt, LD, Inductive logic programming: Theory and methods, J. Log. Program., 19, 629-679, (1994) · Zbl 0816.68043
[35] Nijssen, S., Kok, J.N.: Frequent graph mining and its application to molecular databases. In: Proceedings of the IEEE International Conference on Systems, Man & Cybernetics: The Hague, Netherlands, 10-13 October 2004, pp. 4571-4577. IEEE. https://doi.org/10.1109/ICSMC.2004.1401252 (2004)
[36] Paramonov, S., Chen, T., Guns, T.: Generic mining of condensed pattern representations under constraints. In: CEUR: Young Scientist‘s Second International Workshop on Trends in Information Processing Proceedings (YSIP), vol. 1837, pp. 138-177 (2017)
[37] Peitl, T., Slivovsky, F., Szeider, S.: Dependency learning for QBF. In: Gaspers, S., Walsh, T. (eds.) Theory and Applications of Satisfiability Testing - SAT 2017 - 20th International Conference, Melbourne, VIC, Australia, August 28 - September 1, 2017, Proceedings, Lecture Notes in Computer Science, vol. 10491, pp 298-313. Springer (2017), https://doi.org/10.1007/978-3-319-66263-_19 · Zbl 06807233
[38] Rückert, U., Kramer, S.: Optimizing feature sets for structured data. In: Kok, J.N., Koronacki, J., de Mántaras, R.L., Matwin, S., Mladenic, D., Skowron, A. (eds.) Machine Learning: ECML 2007, 18th European Conference on Machine Learning, Warsaw, Poland, September 17-21, 2007, Proceedings, Lecture Notes in Computer Science, vol. 4701, pp 716-723. Springer (2007), https://doi.org/10.1007/978-3-540-74958-5_72
[39] Silva, J.P.M., Sakallah, K.A.: GRASP - a new search algorithm for satisfiability. In: International Conference on Computer-Aided Design (ICCAD), San Jose, California, USA, November 10-14 1996, pp. 220-227 (1996)
[40] Takigawa, I.; Mamitsuka, H., Graph mining: Procedure, application to drug discovery and recent advances, Drug Discov. Today, 18, 50-57, (2013)
[41] Weinzierl, A.: Blending lazy-grounding and CDNL search for answer-set solving. In: Logic Programming and Nonmonotonic Reasoning (LPNMR), Lecture Notes in Computer Science, vol. 10377, pp. 191-204. Springer (2017) · Zbl 06769661
[42] Yan, X., Han, J.: gspan: Graph-based substructure pattern mining. In: Proceedings of the 2002 IEEE International Conference on Data Mining, ICDM ’02, pp 721-. IEEE Computer Society, Washington (2002)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.