zbMATH — the first resource for mathematics

Multi-objective evolutionary biclustering of gene expression data. (English) Zbl 1103.68775
Summary: Biclustering, the simultaneous clustering of both genes and conditions, has generated considerable interest over the past few decades, particularly in the analysis of high-dimensional gene expression data in information retrieval, knowledge discovery, and data mining. The objective is to find sub-matrices, i.e., maximal subgroups of genes and subgroups of conditions where the genes exhibit highly correlated activities over a range of conditions. Since these two objectives, maximal size and high correlation, are mutually conflicting, they are suitable candidates for multi-objective modeling. In this study, a novel multi-objective evolutionary biclustering framework is introduced that incorporates local search strategies. A new quantitative measure for evaluating the goodness of the biclusters is developed. Experimental results on benchmark datasets demonstrate better performance than existing algorithms in the literature.
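
The conflict between the two objectives can be illustrated with a minimal sketch. It is not the framework of the reviewed paper. It assumes that bicluster size is measured as the number of selected cells and that coherence is measured by the mean squared residue commonly attributed to Cheng and Church; the summary names neither measure. The function names and the toy matrix are hypothetical.

```python
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def mean_squared_residue(data: Matrix, rows: Sequence[int], cols: Sequence[int]) -> float:
    """Mean squared residue of the sub-matrix given by rows x cols.

    Assumed coherence measure: lower is better, and 0.0 means the selected
    genes follow a perfectly additive pattern over the selected conditions.
    """
    n_r, n_c = len(rows), len(cols)
    row_mean = {i: sum(data[i][j] for j in cols) / n_c for i in rows}
    col_mean = {j: sum(data[i][j] for i in rows) / n_r for j in cols}
    all_mean = sum(row_mean.values()) / n_r
    total = sum(
        (data[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
        for i in rows
        for j in cols
    )
    return total / (n_r * n_c)


def objectives(data: Matrix, rows: Sequence[int], cols: Sequence[int]) -> Tuple[int, float]:
    """Return (size, residue): size is to be maximized, residue minimized."""
    return len(rows) * len(cols), mean_squared_residue(data, rows, cols)


def dominates(a: Tuple[int, float], b: Tuple[int, float]) -> bool:
    """Pareto dominance: a is no worse than b on both objectives, better on one."""
    size_a, res_a = a
    size_b, res_b = b
    no_worse = size_a >= size_b and res_a <= res_b
    strictly_better = size_a > size_b or res_a < res_b
    return no_worse and strictly_better


# Hypothetical toy expression matrix: 3 genes (rows) x 3 conditions (columns).
data = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 1.0, 9.0],
]

small = objectives(data, [0, 1], [0, 1, 2])     # two coherent genes
large = objectives(data, [0, 1, 2], [0, 1, 2])  # adds a gene that breaks the pattern

print(small, large)
print(dominates(small, large), dominates(large, small))
```

In this toy case the smaller bicluster is perfectly coherent while the larger one covers more cells at the cost of a higher residue, so neither dominates the other. Trade-offs of this kind are what a multi-objective search has to keep side by side rather than collapse into a single score.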

68T10 Pattern recognition, speech recognition
92C40 Biochemistry, molecular biology