×

zbMATH — the first resource for mathematics

A new mathematical modeling for pure parsimony haplotyping problem. (English) Zbl 1348.92108
Summary: Pure parsimony haplotyping (PPH) problem is important in bioinformatics because rational haplotyping inference plays important roles in analysis of genetic data, mapping complex genetic diseases such as Alzheimer’s disease, heart disorders and etc. Haplotypes and genotypes are \(m\)-length sequences. Although several integer programing models have already been presented for PPH problem, its NP-hardness characteristic resulted in ineffectiveness of those models facing the real instances especially instances with many heterozygous sites. In this paper, we assign a corresponding number to each haplotype and genotype and based on those numbers, we set a mixed integer programing model. Using numbers, instead of sequences, would lead to less complexity of the new model in comparison with previous models in a way that there are neither constraints nor variables corresponding to heterozygous nucleotide sites in it. Experimental results approve the efficiency of the new model in producing better solution in comparison to two state-of-the art haplotyping approaches.
MSC:
92D10 Genetics and epigenetics
92C50 Medical applications (general)
90C11 Mixed integer programming
Software:
GERBIL
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Altshuler, D., The common PPAR γ pro12ala polymorphism is associated with decreased risk of type 2 diabetes, Nat. Genet., 26, 76-80, (2000)
[2] Bell, G. I.; Horita, S.; Karam, J. H., A polymorphic locus near the human insulin gene is associated with insulindependent diabetes mellitus, Diabetes, 33, 176-183, (1984)
[3] Bertolazzi, P.; Godi, A.; Labbé, M.; Tininini, L., Solving haplotyping inference parsimony problem using a new basic polynomial formulation, Comput. Math. Appl., 55, 5, 900-911, (2008) · Zbl 1137.92022
[4] Brown, D.; Harrower, I. M., Integer programming approaches to haplotype inference by pure parsimony, IEEE Trans. Comput. Biol. Bioinf., 3, 2, 141-154, (2006)
[5] Cargill, M., Characterization of single-nucleotide polymorphisms in coding regions of human genes, Nat. Genet., 22, 231-238, (1999)
[6] Catanzaro, D.; Godi, A.; Labbé, M., A class representative model for pure parsimony haplotyping, (2007), G.O.M. - Computer Science Department - Universite´ Libre de Bruxelles (U.L.B.), Technical Report · Zbl 1243.90134
[7] Catanzaro, D.; Labbé, M., The pure parsimony haplotyping problem: overview and computational advances, Int. Trans. Oper. Res., 16, 5, 561-584, (2009) · Zbl 1187.92068
[8] Chapuis, J.; Hot, D.; Hansmannel, F.; Kerdraon, O.; Ferreira, S.; Hubans, C.; Ayral, A. M., Transcriptomic and genetic studies identify IL-33 as a candidate gene for Alzheimer’s disease, Mol. Psychiatry, 14, 11, 1004-1016, (2009)
[9] Choi, M. H.; Kang, S. H.; Lim, H. S., An improved preprocessing algorithm for haplotype inference by pure parsimony, J. Bioinf. Comput. Biol., 12, 04, (2014)
[10] Deeb, S. S., A pro12ala substitution in PPAR g 2 associated with decreased receptor activity, lower body mass index and improved insulin sensitivity, Nat. Genet., 20, 284-287, (1998)
[11] Di Gaspero, L.; Roli, A., Stochastic local search for large-scale instances of the haplotype inference problem by pure parsimony, J. Algorithms, 63, 3, 55-69, (2008) · Zbl 1151.68389
[12] Dorman, S. J., Worldwide differences in the incidence of type I diabetes are associated with amino acid variation at position 57 of the HLA-DQ b chain, Proc. Natl. Acad. Sci. USA, 87, 7370-7374, (1990)
[13] Godi, A.; Tininini, L.; Bertolazzi, P., Haplotype inference by parsimony for large datasets, (2004), IASI, Istituto di Analisi dei Sistemi ed Informatica - CNR Rome, Technical Report 616
[14] Gretarsdottir, S., The gene encoding phosphodiesterase 4D confers risk of ischemic stroke, Nat. Genet., 35, 131-138, (2003)
[15] Gusfield, D., Inference of haplotypes from samples of diploid populations: complexity and algorithms, J. Comput. Biol., 8, 305-324, (2001)
[16] Gusfield, D., Haplotype inference by pure parsimony, (Baeza-Yates, R.; Chávez, E.; rochemore, M., Berlin, Proceedings of the Annual Symposium in Combinatorial Pattern Matching, vol. 2676, (2003), Lecture Note in Computer Science, Springer-Verlag), 144-155
[17] Halldórsson, B. V.; Bafna, V.; Edwards, N.; Lippert, R., Combinatorial problems arising in SNP and haplotype analysis, (Calude, C. S., Discrete Mathematics and Theoretical Computer Science, vol. 2731 Lecture Note in Computer Science, (2003), Springer-Verlag Berlin), 26-47 · Zbl 1040.92026
[18] Halldórsson, B. V.; Bafna, V.; Edwards, N.; Lippert, R.; Calude, C. S., Computational problems arising in SNP and haplotype analysis, Discrete Mathematics and Theoretical Computer Science, Vol. 2731 of Lecture Note in Computer Science, 26-47, (2003), Springer-Verlag Berlin · Zbl 1040.92026
[19] Haluska, M. K., Patterns of single nucleotide polymorphisms in candidate genes of blood pressure homeostasis, Nat. Genet., 22, 239-247, (1999)
[20] Huang, Y. T.; Chao, K. M.; Chen, T., An approximation algorithm for haplotype inference by maximum parsimony, J. Comput. Biol., 12, 10, 1261-1274, (2005)
[21] Hudson, R. R., Generating samples under a wright-Fisher neutral model of genetic variation, Bioinformatics, 18, 337-338, (2002)
[22] Kalpakis, K.; Namjoshi, P., Haplotype phasing using semidefinite programming, (Kapyris, G.; Bourbakis, N.; Tsai, J. J., Proceedings of the IEEE Computer Society on Bioinformatics and Bioengineering (BIBE), Minneapolis, MN, (2005)), 145-152
[23] Kimmel, G.; Shamir, R., GERBIL: genotype resolution and block identification using likelihood, Proc. Natl. Acad. Sci. USA, 102, 1, 158-162, (2005)
[24] Lancia, G., The phasing of heterozygous traits: algorithms and complexity, Comput. Math. Appl., 55, 960-969, (2008) · Zbl 1137.92024
[25] Lancia, G.; Rizzi, R., A polynomial case of the parsimony haplotyping problem, Oper. Res. Lett., 34, 3, 289-295, (2006) · Zbl 1092.92018
[26] Lancia, G.; Serafini, P., A set covering approach with column generation for parsimony haplotyping, INFORMS J. Comput., 21, 151-166, (2008) · Zbl 1243.90140
[27] Lancia, G.; Pinotti, M. C.; Rizzi, R., Haplotyping populations by pure parsimony: complexity of exact and approximate algorithms, INFORMS J. Comput., 16, 4, 348-359, (2004) · Zbl 1239.90076
[28] Li, W. H.; Sadler, L. A., Low nucleotide diversity in man, Genetics, 129, 513-523, (1991)
[29] Li, Z. P.; Zhou, W. F.; Zhang, X. S.; Chen, L., A parsimonious tree-grow method for haplotype inference, Bioinformatics, 21, 17, 3475-3481, (2005)
[30] Nistico´, L., The ctla-4 gene region of chromosome 2q33 is linked to, and associated with, type I diabetes, Hum. Mol. Genet., 5, 1075-1080, (1996)
[31] Pan, W.; Zhao, Y.; Xu, Y.; Zhou, F., Winhap2: an extremely fast haplotype phasing program for long genotype sequences, BMC Bioinf., 15, 1, 1, (2014)
[32] Risch, N.; Merikangas, K., The future of genetic studies of complex human diseases, Science, 273, 1516-1517, (1996)
[33] Stephens, M.; Donnelly, P., A comparison of Bayesian methods for haplotype reconstruction from population genotype data, Am. J. Hum. Genet., 73, 5, 1162-1169, (2003)
[34] Strittmatter, W. J.; Roses, A. D., Apolipoprotein E and Alzheimer’s disease, Ann. Rev. Neurosci., 19, 53-77, (1996)
[35] Trégouët, D. A.; König, I. R.; Erdmann, J.; Munteanu, A.; Braund, P. S.; Hall, A. S.; Meitinger, T., Genome-wide haplotype association study identifies the SLC22A3-LPAL2-LPA gene cluster as a risk locus for coronary artery disease, Nat. Genet., 41, 3, 283-285, (2009)
[36] Van Eerdewegh, P., Association of the ADAM33 gene with asthma and bronchial hyperresponsiveness, Nature, 418, 426-430, (2002)
[37] Wang, D. G., Large-scale identification, mapping, and genotyping of single nucleotide polymorphisms in the human genome, Science, 280, 1077-1082, (1998)
[38] Wang, L.; Xu, Y., Haplotype inference by maximum parsimony, Bioinformatics, 19, 14, 1773-1780, (2003)
[39] Wang, R. S.; Zhang, X. S.; Sheng, L., Haplotype inference by pure parsimony via genetic algorithm, Operations Research and Its Applications, Lecture Notes in Operations Research, vol. 5, 308-318, (2005), World Publishing Corporation
[40] Wei, B.; Zhao, J., Haplotype inference using a novel binary particle swarm optimization algorithm, Appl. Soft Comput., 21, 415-422, (2014)
[41] Zhang, X. S.; Wang, R. S.; Wu, L. Y.; Chen, L., Models and algorithms for haplotyping problem, Curr. Bioinf., 1, 105-114, (2006)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.