Zhang, Nancy R.; Siegmund, David O.
A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data. (English) Zbl 1206.62174
Biometrics 63, No. 1, 22-32 (2007).

Summary: In the analysis of data generated by change-point processes, one critical challenge is to determine the number of change-points. The classic Bayes information criterion (BIC) statistic does not work well here because of irregularities in the likelihood function. By asymptotic approximation of the Bayes factor, we derive a modified BIC for the model of Brownian motion with changing drift. The modified BIC is similar to the classic BIC in the sense that the first term consists of the log likelihood, but it differs in the terms that penalize for model dimension. As an example of application, this new statistic is used to analyze array-based comparative genomic hybridization (array-CGH) data. Array-CGH measures the number of chromosome copies at each genome location of a cell sample, and is useful for finding the regions of genome deletion and amplification in tumor cells. The modified BIC performs well compared to existing methods in accurately choosing the number of regions of changed copy number. Unlike existing methods, it does not rely on tuning parameters or intensive computing. Thus it is impartial and easier to understand and to use.
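To make the contrast between the two penalties concrete, the following is a minimal Python sketch of choosing the number of change-points in a piecewise-constant mean with known noise variance. The summary does not state any formula, so everything specific here is an assumption to be checked against the paper: the classic penalty counts each change-point as two parameters (m log n in total), and the modified penalty is taken to be -(1/2) Σ log(segment length) + (1/2 - m) log n, evaluated at the least-squares change-points. The function names `best_segmentations` and `score_models` are hypothetical, and the dynamic program is a generic least-squares segmentation, not necessarily the authors' procedure.

```python
import math


def best_segmentations(x, max_m):
    """Least-squares segmentations of x with 0..max_m change-points.

    Returns a list indexed by m of (rss, bounds), where bounds are the
    segment boundaries [0, t1, ..., tm, n]. Generic O(max_m * n^2) dynamic
    program; illustrative only.
    """
    n = len(x)
    s1, s2 = [0.0], [0.0]  # prefix sums of x and x^2
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        # Residual sum of squares of x[i:j] around its own mean.
        s = s1[j] - s1[i]
        return (s2[j] - s2[i]) - s * s / (j - i)

    inf = float("inf")
    # dp[k][j]: minimal RSS of x[:j] split into k + 1 segments.
    dp = [[inf] * (n + 1) for _ in range(max_m + 1)]
    back = [[0] * (n + 1) for _ in range(max_m + 1)]
    for j in range(1, n + 1):
        dp[0][j] = cost(0, j)
    for k in range(1, max_m + 1):
        for j in range(k + 1, n + 1):
            for i in range(k, j):
                c = dp[k - 1][i] + cost(i, j)
                if c < dp[k][j]:
                    dp[k][j] = c
                    back[k][j] = i

    results = []
    for k in range(max_m + 1):
        if dp[k][n] == inf:
            break  # more change-points requested than the data allow
        bounds, j = [n], n
        for kk in range(k, 0, -1):
            j = back[kk][j]
            bounds.append(j)
        bounds.append(0)
        results.append((dp[k][n], bounds[::-1]))
    return results


def score_models(x, sigma2, max_m):
    """Score each m with a classic and an assumed modified BIC.

    Returns a list of (m, bounds, classic_score, modified_score);
    larger scores are better. Both penalty forms are assumptions.
    """
    n = len(x)
    segs = best_segmentations(x, max_m)
    rss0 = segs[0][0]
    scores = []
    for m, (rss, bounds) in enumerate(segs):
        # Log-likelihood gain over the no-change model (known variance).
        gain = (rss0 - rss) / (2.0 * sigma2)
        # Classic BIC, counting 2 parameters per change-point (assumption).
        classic = gain - m * math.log(n)
        # Assumed modified penalty: depends on segment lengths as well as m.
        log_lengths = sum(math.log(b - a) for a, b in zip(bounds, bounds[1:]))
        modified = gain - 0.5 * log_lengths + (0.5 - m) * math.log(n)
        scores.append((m, bounds, classic, modified))
    return scores


if __name__ == "__main__":
    # Toy data: one jump from 0 to 3 at index 20, with +/-0.5 alternating noise.
    data = [0.5 * (-1) ** i for i in range(20)]
    data += [3 + 0.5 * (-1) ** i for i in range(20)]
    for m, bounds, classic, modified in score_models(data, 0.25, 4):
        print(m, bounds, round(classic, 3), round(modified, 3))
```

Under these assumptions the modified score equals zero for the no-change model, and its penalty varies with where the change-points fall, not only with how many there are, which is the kind of difference in the dimension penalty that the summary describes.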
Cited in 2 reviews and 88 documents.

MSC:
62P10 Applications of statistics to biology and medical sciences; meta analysis
92C40 Biochemistry, molecular biology
62F15 Bayesian inference