minPtest: a resampling based gene region-level testing procedure for genetic case-control studies.

*(English)* Zbl 1306.65068

Summary: Current technologies generate a huge number of single nucleotide polymorphism (SNP) genotype measurements in case-control studies. The resulting multiple testing problem can be ameliorated by considering candidate gene regions. The minPtest R package provides the first widely accessible implementation of a gene region-level summary for each candidate gene using the min \(P\) test. The latter is a permutation-based method that can be based on different univariate tests per SNP. The package brings together three different kinds of tests which were scattered over several R packages, and automatically selects the most appropriate one for the study design at hand. The implementation of the minPtest integrates two different parallel computing packages, thus optimally leveraging available resources for speedy results.
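To make the permutation idea in the summary concrete, the following is a minimal Python sketch of a min \(P\) test for one gene region. It is not the minPtest package's code or API: the function names `trend_p_value` and `min_p_test`, the choice of a Cochran-Armitage-type trend test as the per-SNP univariate test (computed here as \(N r^2\) with a chi-square approximation on 1 degree of freedom), the 0/1/2 genotype coding, and the add-one correction in the permutation p-value are all assumptions made for illustration.

```python
import math
import random


def trend_p_value(genotypes, status):
    """P-value of a trend test for one SNP (assumed univariate test).

    genotypes: list of 0/1/2 minor-allele counts; status: list of 0/1 labels.
    The statistic N * r^2 is referred to a chi-square distribution with 1 df.
    """
    n = len(status)
    mean_g = sum(genotypes) / n
    mean_y = sum(status) / n
    s_gy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, status))
    s_gg = sum((g - mean_g) ** 2 for g in genotypes)
    s_yy = sum((y - mean_y) ** 2 for y in status)
    if s_gg == 0 or s_yy == 0:
        return 1.0  # monomorphic SNP or only one class: no evidence
    chi2 = n * s_gy ** 2 / (s_gg * s_yy)
    # Upper tail of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))


def min_p_test(snps, status, n_perm=1000, seed=0):
    """Permutation-based min P test for one gene region (illustrative).

    snps: list of genotype lists, one per SNP in the region.
    Returns (observed minimum p-value, permutation p-value for the region).
    """
    rng = random.Random(seed)
    observed = min(trend_p_value(g, status) for g in snps)
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        # Permuting case-control labels keeps the correlation between SNPs.
        rng.shuffle(shuffled)
        perm_min = min(trend_p_value(g, shuffled) for g in snps)
        if perm_min <= observed:
            hits += 1
    # Add-one correction (an assumption here) avoids a p-value of exactly 0.
    return observed, (hits + 1) / (n_perm + 1)


# Toy region with two SNPs; the first is strongly associated with status.
status = [0] * 20 + [1] * 20
snp_a = [0] * 20 + [2] * 20
snp_b = [0, 1, 2, 1] * 10
observed_min_p, region_p = min_p_test([snp_a, snp_b], status, n_perm=200)
```

Because the whole label vector is permuted at once, the minimum p-value is recomputed across all SNPs of the region in each permutation, so the region-level p-value accounts for the number of SNPs tested and their dependence without a separate multiplicity correction within the region.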

##### MSC:

65C60 | Computational problems in statistics (MSC2010) |

62P10 | Applications of statistics to biology and medical sciences; meta analysis |

##### Keywords:

single nucleotide polymorphism; gene region-level summary; min \(P\) test; permutation-based resampling

\textit{S. Hieke} et al., Comput. Stat. 29, No. 1--2, 51--63 (2014; Zbl 1306.65068)

