×

Topological machine learning with persistence indicator functions. (English) Zbl 1465.68230

Carr, Hamish (ed.) et al., Topological methods in data analysis and visualization V. Theory, algorithms, and applications. Selected papers based on the presentations at the TopoInVis workshop, Tokyo, Japan, February 27–28, 2017. Cham: Springer. Math. Vis., 87-101 (2020).
Summary: Techniques from computational topology, in particular persistent homology, are becoming increasingly relevant for data analysis. Their stable metrics permit the use of many distance-based data analysis methods, such as multidimensional scaling, while providing a firm theoretical ground. Many modern machine learning algorithms, however, are based on kernels. This paper presents persistence indicator functions(PIFs), which summarize persistence diagrams, i.e., feature descriptors in topological data analysis. PIFs can be calculated and compared in linear time and have many beneficial properties, such as the availability of a kernel-based similarity measure. We demonstrate their usage in common data analysis scenarios, such as confidence set estimation and classification of complex structured data.
For the entire collection see [Zbl 1453.68012].

MSC:

68T05 Learning and adaptive systems in artificial intelligence
55N31 Persistent homology and applications, topological data analysis
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Adams, H., Emerson, T., Kirby, M., Neville, R., Peterson, C., Shipman, P., Chepushtanova, S., Hanson, E., Motta, F., Ziegelmeier, L.: Persistence images: a stable vector representation of persistent homology. J. Mach. Learn. Res. 18(8), 1-35 (2017) · Zbl 1431.68105
[2] Agarwal, P.K., Edelsbrunner, H., Harer, J., Wang, Y.: Extreme elevation on a 2-manifold. Discr. Comput. Geom. 36(4), 553-572 (2006) · Zbl 1105.52010 · doi:10.1007/s00454-006-1265-8
[3] Bobrowski, O., Kahle, M.: Topology of random geometric complexes: a survey (2014). https://arxiv.org/abs/1409.4734 · Zbl 1402.60015
[4] Bremer, P.T., Edelsbrunner, H., Hamann, B., Pascucci, V.: A topological hierarchy for functions on triangulated surfaces. IEEE Trans. Vis. Comput. Graph. 10(4), 385-396 (2004). https://doi.org/10.1109/TVCG.2004.3 · doi:10.1109/TVCG.2004.3
[5] Bubenik, P.: Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res. 16, 77-102 (2015) · Zbl 1337.68221
[6] Chazal, F., Fasy, B.T., Lecci, F., Rinaldo, A., Singh, A., Wasserman, L.: On the bootstrap for persistence diagrams and landscapes. Model. Anal. Inf. Syst. 20(6), 111-120 (2013) · doi:10.18255/1818-1015-2013-6-111-120
[7] Cohen-Steiner, D., Edelsbrunner, H., Harer, J.: Stability of persistence diagrams. Discr. Comput. Geom. 37(1), 103-120 (2007) · Zbl 1117.54027 · doi:10.1007/s00454-006-1276-5
[8] Cohen-Steiner, D., Edelsbrunner, H., Harer, J., Mileyko, Y.: Lipschitz functions have \(L_{}\) p-stable persistence. Found. Comput. Math. 10(2), 127-139 (2010) · Zbl 1192.55007 · doi:10.1007/s10208-010-9060-6
[9] Edelsbrunner, H., Harer, J.: Computational Topology: An Introduction. AMS, New York (2010) · Zbl 1193.55001
[10] Edelsbrunner, H., Morozov, D.: Persistent homology: theory and practice. In: European Congress of Mathematics. EMS Publishing House, Zürich (2014)
[11] Edelsbrunner, H., Letscher, D., Zomorodian, A.J.: Topological persistence and simplification. Discr. Comput. Geom. 28(4), 511-533 (2002) · Zbl 1011.68152 · doi:10.1007/s00454-002-2885-2
[12] Efron, B., Tibshirani, R.J.: An Introduction to the Bootstrap. Monographs on Statistics and Applied Probability, vol. 57 . Chapman & Hall/CRC, Boca Raton, FL (1993) · Zbl 0835.62038
[13] Günther, D., Boto, R.A., Contreras-Garcia, J., Piquemal, J.P., Tierny, J.: Characterizing molecular interactions in chemical systems. IEEE Trans. Vis. Comp. Graph. 20(12), 2476-2485 (2014) · doi:10.1109/TVCG.2014.2346403
[14] Kerber, M., Morozov, D., Nigmetov, A.: Geometry helps to compare persistence diagrams. In: Goodrich, M., Mitzenmacher, M. (eds.) Proceedings of the 18th Workshop on Algorithm Engineering and Experiments (ALENEX), pp. 103-112. SIAM, Philadelphia, PA (2016) · Zbl 1430.68376
[15] Kohavi, R.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the IJCAI, vol. 2, pp. 1137-1143 (1995)
[16] Kosorok, M.R.: Introduction to Empirical Processes and Semiparametric Inference. Springer, New York, NY (2008) · Zbl 1180.62137 · doi:10.1007/978-0-387-74978-5
[17] Laney, D., Bremer, P.T., Mascarenhas, A., Miller, P., Pascucci, V.: Understanding the structure of the turbulent mixing layer in hydrodynamic instabilities. IEEE Trans. Vis. Comput. Graph. 12(5), 1053-1060 (2006) · doi:10.1109/TVCG.2006.186
[18] Mucha, M., Sankowski, P.: Maximum matchings via Gaussian elimination. In: 45th Annual IEEE Symposium on Foundations of Computer Science, pp. 248-255 (2004) · Zbl 1111.05304
[19] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, É.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825-2830 (2011) · Zbl 1280.68189
[20] Reininghaus, J., Huber, S., Bauer, U., Kwitt, R.: A stable multi-scale kernel for topological machine learning. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4741-4748. Curran Associates, Inc., Red Hook, NY (2015)
[21] Rieck, B., Leitte, H.: Shall I compare thee to a network?—Visualizing the topological structure of Shakespeare’s plays. In: Workshop on Visualization for the Digital Humanities at IEEE VIS. Baltimore, MD (2016)
[22] Rieck, B., Fugacci, U., Lukasczyk, J., Leitte, H.: Clique community persistence: a topological visual analysis approach for complex networks. IEEE Trans. Vis. Comput. Graph. 22(1), 822-831 (2018). https://doi.org/10.1109/TVCG.2017.2744321 · doi:10.1109/TVCG.2017.2744321
[23] Schölkopf, B., Smola, A.J.: Learning with Kernels. The MIT Press, Cambridge, MA (2002) · Zbl 1019.68094
[24] Schölkopf, B., Smola, A.J., Müller, K.R.: Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput. 10(5), 1299-1319 (1998) · doi:10.1162/089976698300017467
[25] Sugiyama, M., Ghisu, M.E., Llinares-López, F., Borgwardt, K.: graphkernels: R and Python packages for graph comparison. Bioinformatics 34(3), 530-532 (2017)
[26] Turner, K., Mileyko, Y., Mukherjee, S., Harer, J.: Fréchet means for distributions of persistence diagrams. Discr. Comput. Geom. 52(1), 44-70 (2014) · Zbl 1296.68182 · doi:10.1007/s00454-014-9604-7
[27] Yanardag, P.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.