zbMATH — the first resource for mathematics

A study of biases of DNA copy number estimation based on PICR model. (English) Zbl 1271.62271
Summary: Affymetrix single-nucleotide polymorphism (SNP) arrays have been widely used for SNP genotype calling and copy number variation (CNV) studies, both of which depend heavily on accurate DNA copy number estimation. However, methods for copy number estimation face several difficulties: probe-dependent binding affinity, cross-hybridization of probes, and whole genome amplification (WGA) of DNA sequences. The probe intensity composite representation (PICR) model, a previously established approach, can cope with most of these complexities and achieves high accuracy in SNP genotyping. Nevertheless, the copy numbers estimated by the PICR model still show array- and site-dependent biases in CNV studies. In this paper, we propose a procedure to adjust for these biases and then make CNV inferences based on both the PICR model and our method. The comparison indicates that our correction of copy numbers is necessary for CNV studies.
62P10 Applications of statistics to biology and medical sciences; meta analysis
68U01 General topics in computing methodologies
92D20 Protein sequences, DNA sequences
[1] Barnes C, Plagnol V, Fitzgerald T, Redon R, Marchini J, Clayton D, Hurles M E. A robust statistical method for case-control association testing with copy number variation. Nat Genet, 2008, 40(10): 1245–1252 · doi:10.1038/ng.206
[2] Bengtsson H, Irizarry R, Carvalho B, Speed T P. Estimation and assessment of raw copy numbers at the single locus level. Bioinformatics, 2008, 24(6): 759–767 · Zbl 05511572 · doi:10.1093/bioinformatics/btn016
[3] Bengtsson H, Wirapati P, Speed T P. A single-array preprocessing method for estimating full-resolution raw copy numbers from all Affymetrix genotyping arrays including GenomeWideSNP 5 & 6. Bioinformatics, 2009, 25(17): 2149–2156 · Zbl 05744168 · doi:10.1093/bioinformatics/btp371
[4] Bignell G R, Huang J, Greshock J, Watt S, Butler A, West S, Grigorova M, Jones K W, Wei W, Stratton M R, et al. High-resolution analysis of DNA copy number using oligonucleotide microarrays. Genome Res, 2004, 14(2): 287–295 · doi:10.1101/gr.2012304
[5] Carter N P. Methods and strategies for analyzing copy number variation using DNA microarrays. Nat Genet, 2007, 39(7 Suppl): S16–21 · doi:10.1038/ng2028
[6] Di X, Matsuzaki H, Webster T A, Hubbell E, Liu G, Dong S, Bartell D, Huang J, Chiles R, Yang G, et al. Dynamic model based algorithms for screening and genotyping over 100 K SNPs on oligonucleotide microarrays. Bioinformatics, 2005, 21(9): 1958–1963 · doi:10.1093/bioinformatics/bti275
[7] Greenman C D, Bignell G, Butler A, Edkins S, Hinton J, Beare D, Swamy S, Santarius T, Chen L, Widaa S, Futreal P A, Stratton M R. PICNIC: an algorithm to predict absolute allelic copy number variation with microarray cancer data. Biostatistics, 2010, 11(1): 164–175 · doi:10.1093/biostatistics/kxp045
[8] Held G A, Grinstein G, Tu Y. Modeling of DNA microarray data by using physical properties of hybridization. Proc Natl Acad Sci USA, 2003, 100(13): 7575–7580 · doi:10.1073/pnas.0832500100
[9] Held G A, Grinstein G, Tu Y. Relationship between gene expression and observed intensities in DNA microarrays - a modeling study. Nucleic Acids Res, 2006, 34(9): e70 · doi:10.1093/nar/gkl122
[10] Huang J, Wei W, Chen J, Zhang J, Liu G, Di X, Mei R, Ishikawa S, Aburatani H, Jones K W, et al. CARAT: a novel method for allelic detection of DNA copy number changes using high density oligonucleotide arrays. BMC Bioinformatics, 2006, 7: 83 · doi:10.1186/1471-2105-7-83
[11] Iafrate A J, Feuk L, Rivera M N, Listewnik M L, Donahoe P K, Qi Y, Scherer S W, Lee C. Detection of large-scale variation in the human genome. Nat Genet, 2004, 36(9): 949–951 · doi:10.1038/ng1416
[12] Johnson W E, Li W, Meyer C A, Gottardo R, Carroll J S, Brown M, Liu X S. Model-based analysis of tiling-arrays for ChIP-chip. Proc Natl Acad Sci USA, 2006, 103(33): 12457–12462 · doi:10.1073/pnas.0601180103
[13] Kapur K, Jiang H, Xing Y, Wong W H. Cross-hybridization modeling on Affymetrix exon arrays. Bioinformatics, 2008, 24(24): 2887–2893 · Zbl 05743656 · doi:10.1093/bioinformatics/btn571
[14] Korn J M, Kuruvilla F G, McCarroll S A, Wysoker A, Nemesh J, Cawley S, Hubbell E, Veitch J, Collins P J, Darvishi K, et al. Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat Genet, 2008, 40(10): 1253–1260 · doi:10.1038/ng.237
[15] Laframboise T, Harrington D, Weir B A. PLASQ: a generalized linear model-based procedure to determine allelic dosage in cancer cells from SNP array data. Biostatistics, 2007, 8(2): 323–336 · Zbl 1144.62098 · doi:10.1093/biostatistics/kxl012
[16] McCarroll S A, Kuruvilla F G, Korn J M, Cawley S, Nemesh J, Wysoker A, Shapero M H, de Bakker P I, Maller J B, Kirby A, et al. Integrated detection and population-genetic analysis of SNPs and copy number variation. Nat Genet, 2008, 40(10): 1166–1174 · doi:10.1038/ng.238
[17] Nannya Y, Sanada M, Nakazaki K, Hosoya N, Wang L, Hangaishi A, Kurokawa M, Chiba S, Bailey D K, Kennedy G C, et al. A robust algorithm for copy number detection using high-density oligonucleotide single nucleotide polymorphism genotyping arrays. Cancer Res, 2005, 65(14): 6071–6079 · doi:10.1158/0008-5472.CAN-05-0465
[18] Olshen A B, Venkatraman E S, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics, 2004, 5(4): 557–572 · Zbl 1155.62478 · doi:10.1093/biostatistics/kxh008
[19] Ono N, Suzuki S, Furusawa C, Agata T, Kashiwagi A, Shimizu H, Yomo T. An improved physico-chemical model of hybridization on high-density oligonucleotide microarrays. Bioinformatics, 2008, 24(10): 1278–1285 · Zbl 05511618 · doi:10.1093/bioinformatics/btn109
[20] Pugh T J, Delaney A D, Farnoud N, Flibotte S, Griffith M, Li H I, Qian H, Farinha P, Gascoyne R D, Marra M A. Impact of whole genome amplification on analysis of copy number variants. Nucleic Acids Res, 2008, 36(13): e80 · doi:10.1093/nar/gkn378
[21] Rabbee N, Speed T P. A genotype calling algorithm for Affymetrix SNP arrays. Bioinformatics, 2006, 22(1): 7–12 · doi:10.1093/bioinformatics/bti741
[22] Redon R, Ishikawa S, Fitch K R, Feuk L, Perry G H, Andrews T D, Fiegler H, Shapero M H, Carson A R, Chen W, et al. Global variation in copy number in the human genome. Nature, 2006, 444(7118): 444–454 · doi:10.1038/nature05329
[23] Scherer S W, Lee C, Birney E, Altshuler D M, Eichler E E, Carter N P, Hurles M E, Feuk L. Challenges and standards in integrating surveys of structural variation. Nat Genet, 2007, 39(7 Suppl): S7–15 · doi:10.1038/ng2093
[24] Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P, Maner S, Massa H, Walker M, Chi M Y, et al. Large-scale copy number polymorphism in the human genome. Science, 2004, 305(5683): 525–528 · doi:10.1126/science.1098918
[25] Slater H R, Bailey D K, Ren H, Cao M, Bell K, Nasioulas S, Henke R, Choo K H, Kennedy G C. High-resolution identification of chromosomal abnormalities using oligonucleotide arrays containing 116,204 SNPs. Am J Hum Genet, 2005, 77(5): 709–726 · doi:10.1086/497343
[26] Wan L, Sun K, Ding Q, Cui Y, Li M, Wen Y, Elston R C, Qian M, Fu W J. Hybridization modeling of oligonucleotide SNP arrays for accurate DNA copy number estimation. Nucleic Acids Res, 2009, 37(17): e117 · doi:10.1093/nar/gkp559
[27] Wan L, Xiao Y, Chen Q, Deng M, Qian M. The analysis of biases of copy numbers from Affymetrix SNP arrays. Communications in Information and Systems, 2010, 10(2): 81–96 · Zbl 1185.92049
[28] Weir B A, Woo M S, Getz G, Perner S, Ding L, Beroukhim R, Lin W M, Province M A, Kraja A, Johnson L A, et al. Characterizing the cancer genome in lung adenocarcinoma. Nature, 2007, 450(7171): 893–898 · doi:10.1038/nature06358
[29] Xiao Y, Segal M R, Yang Y H, Yeh R F. A multi-array multi-SNP genotyping algorithm for Affymetrix SNP microarrays. Bioinformatics, 2007, 23(12): 1459–1467 · doi:10.1093/bioinformatics/btm131
[30] Zhang L, Miles M F, Aldape K D. A model of molecular interactions on short oligonucleotide microarrays. Nat Biotechnol, 2003, 21(7): 818–821 · doi:10.1038/nbt836
[31] Zhang L, Wu C, Carta R, Zhao H. Free energy of DNA duplex formation on short oligonucleotide microarrays. Nucleic Acids Res, 2007, 35(3): e18 · doi:10.1093/nar/gkl1064
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.