Gene-level pharmacogenetic analysis on survival outcomes using gene-trait similarity regression. (English) Zbl 1454.62412

Summary: Gene/pathway-based methods are drawing significant attention due to their usefulness in detecting rare and common variants that affect disease susceptibility. The biological mechanism of drug responses indicates that a gene-based analysis has even greater potential in pharmacogenetics. Motivated by a study from the Vitamin Intervention for Stroke Prevention (VISP) trial, we develop a gene-trait similarity regression for survival analysis to assess the effect of a gene or pathway on time-to-event outcomes. The similarity regression has a general framework that covers a range of survival models, such as the proportional hazards model and the proportional odds model. The inference procedure developed under the proportional hazards model is robust against model misspecification. We derive the equivalence between the similarity survival regression and a random effects model, which further unifies the current variance component-based methods. We demonstrate the effectiveness of the proposed method through simulation studies. In addition, we apply the method to the VISP trial data to identify the genes that exhibit an association with the risk of a recurrent stroke. The TCN2 gene was found to be associated with the recurrent stroke risk in the low-dose arm. This gene may impact recurrent stroke risk in response to cofactor therapy.
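
Illustration (not part of the original summary): a minimal sketch of the general idea of relating pairwise genetic similarity to survival outcomes through a quadratic-form score statistic. Everything here is an assumption made for illustration: the identity-by-state similarity measure, the covariate-free null model with Nelson-Aalen martingale residuals, the permutation p-value, and all function names. The paper's own estimator and its analytic inference procedure are not reproduced.

```python
import numpy as np


def ibs_similarity(G):
    """Pairwise identity-by-state similarity for an (n, p) genotype matrix coded 0/1/2."""
    G = np.asarray(G, dtype=float)
    # |g_ik - g_jk| is 0, 1 or 2, so alleles shared at one marker = 2 - |difference|.
    diff = np.abs(G[:, None, :] - G[None, :, :])
    return (2.0 - diff).sum(axis=2) / (2.0 * G.shape[1])


def null_martingale_residuals(time, event):
    """Martingale residuals under a covariate-free null (Nelson-Aalen cumulative hazard)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    # Number of subjects still at risk at each subject's observed time.
    at_risk = (time[None, :] >= time[:, None]).sum(axis=1)
    # Each observed event contributes 1 / (number at risk) to the cumulative hazard.
    increments = event / at_risk
    cum_haz = ((time[None, :] <= time[:, None]) * increments[None, :]).sum(axis=1)
    return event - cum_haz


def similarity_score_test(G, time, event, n_perm=999, seed=0):
    """Quadratic form r' S r with a permutation p-value (an illustrative stand-in)."""
    S = ibs_similarity(G)
    r = null_martingale_residuals(time, event)
    q_obs = float(r @ S @ r)
    rng = np.random.default_rng(seed)
    q_perm = np.empty(n_perm)
    for b in range(n_perm):
        rp = rng.permutation(r)  # break the link between genotypes and outcomes
        q_perm[b] = rp @ S @ rp
    p_value = (1.0 + np.sum(q_perm >= q_obs)) / (n_perm + 1.0)
    return q_obs, float(p_value)


if __name__ == "__main__":
    # Hypothetical simulated data: 60 subjects, 8 markers, no true gene effect.
    rng = np.random.default_rng(1)
    G = rng.binomial(2, 0.3, size=(60, 8))
    time = rng.exponential(1.0, size=60)
    event = rng.binomial(1, 0.7, size=60)
    print(similarity_score_test(G, time, event))
```

The permutation step is valid only because this toy null model has no covariates; with baseline covariates or treatment arms, the residuals would come from a fitted null survival model and the reference distribution would need to account for that.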


62P10 Applications of statistics to biology and medical sciences; meta analysis
62N02 Estimation in survival analysis and censored data

