Atom-specific persistent homology and its application to protein flexibility analysis.

*(English)* Zbl 1439.92081

Summary: Recently, persistent homology has had tremendous success in biomolecular data analysis. It works by examining the topological relationships, or connectivity, of a group of atoms in a molecule at a variety of scales, and then rendering a family of topological representations of the molecule. However, persistent homology is rarely employed for the analysis of atomic properties, such as biomolecular flexibility analysis or B-factor prediction. This work introduces atom-specific persistent homology to provide a local, atomic-level representation of a molecule via a global topological tool. This is achieved through the construction of a pair of conjugated sets of atoms and the corresponding conjugated simplicial complexes, as well as conjugated topological spaces. The difference between the topological invariants of the pair of conjugated sets is measured by bottleneck and Wasserstein metrics, and leads to an atom-specific topological representation of individual atomic properties in a molecule. Atom-specific topological features are integrated with various machine learning algorithms, including gradient boosting trees and convolutional neural networks, for protein thermal fluctuation analysis and B-factor prediction. Extensive numerical results indicate that the proposed method provides a powerful topological tool for analyzing and predicting localized information in complex macromolecules.
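The idea of comparing a pair of conjugated atom sets can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the summary does not confirm: the conjugated pair for atom `i` is taken to be the point cloud with and without that atom, only dimension-0 homology of a Vietoris-Rips filtration is used (scale equal to pairwise distance, so the finite bars die at the minimum-spanning-tree edge lengths), and only the Wasserstein metric is computed. All function names are hypothetical.

```python
import math


def h0_deaths(points):
    """Death scales of the finite H0 bars of a Vietoris-Rips filtration.

    Assumption: the filtration scale is the pairwise distance, so every bar
    is born at 0 and dies at a minimum-spanning-tree edge length (Prim).
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n
    best[0] = 0.0
    deaths = []
    for step in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:  # the first vertex starts the tree and adds no edge
            deaths.append(best[u])
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return sorted(deaths)


def min_cost_assignment(cost):
    """Minimum total cost of a perfect matching (Hungarian algorithm)."""
    n = len(cost)
    u = [0.0] * (n + 1)
    v = [0.0] * (n + 1)
    p = [0] * (n + 1)      # p[j]: row currently matched to column j
    way = [0] * (n + 1)
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [math.inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], math.inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # augment along the alternating path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sum(cost[p[j] - 1][j - 1] for j in range(1, n + 1))


def wasserstein_h0(deaths_a, deaths_b, p=2):
    """p-Wasserstein distance between two H0 diagrams with all births at 0.

    Points may be matched to each other (cost |a - b|) or to the diagonal
    (cost a / 2), using the L-infinity ground metric.
    """
    m, k = len(deaths_a), len(deaths_b)
    size = m + k
    if size == 0:
        return 0.0
    cost = [[0.0] * size for _ in range(size)]
    for i in range(m):
        for j in range(k):
            cost[i][j] = abs(deaths_a[i] - deaths_b[j]) ** p
        for j in range(k, size):
            cost[i][j] = (deaths_a[i] / 2) ** p
    for i in range(m, size):
        for j in range(k):
            cost[i][j] = (deaths_b[j] / 2) ** p
    return min_cost_assignment(cost) ** (1 / p)


def atom_specific_features(points, p=2):
    """One feature per atom: distance between the diagrams with and without it."""
    full = h0_deaths(points)
    return [
        wasserstein_h0(full, h0_deaths(points[:i] + points[i + 1:]), p)
        for i in range(len(points))
    ]


# Toy "molecule": three collinear atoms (hypothetical coordinates).
toy_atoms = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
print(atom_specific_features(toy_atoms))
```

Each returned value measures how much the dimension-0 persistence diagram changes when one atom is removed. Such per-atom values are the kind of input that could be passed to a regressor such as gradient boosting trees. A practical pipeline would more likely use a dedicated persistent homology library and higher homology dimensions.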

##### MSC:

92C40 | Biochemistry, molecular biology |

55N99 | Homology and cohomology theories in algebraic topology |

##### Keywords:

atom-specific topology; element-specific persistent homology; protein flexibility; gradient boosting tree; convolutional neural network

*D. Bramer* and *G.-W. Wei*, Comput. Math. Biophys. 8, No. 1, 1-35 (2020; Zbl 1439.92081)

