RSC-based differential model with correlation removal for improving multi-omics clustering. (English) Zbl 1504.92057

Summary: Multi-omics clustering plays an important role in cancer subtyping. However, data from different omics are often correlated, and these correlations may degrade the performance of clustering algorithms, so it is crucial to eliminate the redundant information they introduce. We propose an RSC-based differential model with correlation removal for improving multi-omics clustering (RSC-MCR). The method first uses RSC to calculate the pairwise correlations of all features and decomposes the resulting correlation matrix to obtain the pairwise correlations between features of different omics, which establish the connection between the omics. Then, to remove the redundant correlation, a differential model calculates the degree of difference between the original feature matrix and the correlation matrix, which contains the most relevant information shared between different omics. We compare RSC-MCR with other decorrelation methods under several clustering methods (CC, FCM, SNF, NMF, LRAcluster). Experimental results on five cancer datasets show the efficiency of RSC-MCR as well as improvements over other decorrelation methods.
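The two steps in the summary (correlate all features jointly, then split off the cross-omics block and subtract what it explains) can be illustrated with a rough sketch. This is not the RSC-MCR algorithm itself: the summary does not specify the RSC estimator or the differential model, so the code below assumes plain Pearson correlation via `np.corrcoef` as a stand-in for RSC and an ordinary least-squares residual as a stand-in for the differential step. The function names `standardize`, `split_correlation`, and `remove_cross_omics_part` are hypothetical.

```python
import numpy as np

def standardize(X):
    # Zero mean and unit variance per feature (assumes no constant columns).
    return (X - X.mean(axis=0)) / X.std(axis=0)

def split_correlation(X1, X2):
    # Correlate all features jointly, then decompose the matrix into
    # within-omics blocks (C11, C22) and the cross-omics block (C12).
    # ASSUMPTION: Pearson correlation stands in for RSC here.
    p1 = X1.shape[1]
    C = np.corrcoef(np.hstack([X1, X2]), rowvar=False)
    return C[:p1, :p1], C[:p1, p1:], C[p1:, p1:]

def remove_cross_omics_part(X1, X2):
    # ASSUMPTION: the "differential" step is modelled as the residual of
    # omics 1 after subtracting the part linearly explained by omics 2.
    Z1, Z2 = standardize(X1), standardize(X2)
    _, C12, C22 = split_correlation(Z1, Z2)
    B = np.linalg.solve(C22, C12.T)  # regression coefficients, shape (p2, p1)
    return Z1 - Z2 @ B

# Synthetic data: 60 samples, two "omics" sharing one strongly related feature.
rng = np.random.default_rng(0)
X2 = rng.normal(size=(60, 3))
X1 = np.hstack([X2[:, :1] + 0.1 * rng.normal(size=(60, 1)),
                rng.normal(size=(60, 3))])
D1 = remove_cross_omics_part(X1, X2)
```

After this step the columns of `D1` are linearly uncorrelated with every feature of `X2`, so a downstream clustering method would see omics 1 without the component it shares with omics 2. A robust, sparse estimator in place of `np.corrcoef` would change only `split_correlation`.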

MSC:

92C50 Medical applications (general)
62P10 Applications of statistics to biology and medical sciences; meta analysis
