zbMATH — the first resource for mathematics

RCRnorm: an integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data. (English) Zbl 1433.62306
Summary: Formalin-fixed paraffin-embedded (FFPE) samples have great potential for biomarker discovery, retrospective studies and diagnosis or prognosis of diseases. Their application, however, is hindered by the unsatisfactory performance of traditional gene expression profiling techniques on damaged RNAs. The NanoString nCounter platform is well suited for profiling FFPE samples and measures gene expression with high sensitivity, which may greatly facilitate realization of the scientific and clinical value of FFPE samples. However, methodological development for normalization, a critical step when analyzing this type of data, lags far behind. Existing methods designed for the platform use information from different types of internal controls separately and rely on an oversimplified assumption that expression of housekeeping genes is constant across samples for global scaling. Thus, these methods are not optimized for the nCounter system, not to mention that they were not developed for FFPE samples. We construct an integrated system of random-coefficient hierarchical regression models to capture the main patterns and characteristics observed in NanoString data from FFPE samples and develop a Bayesian approach to estimate parameters and normalize gene expression across samples. Our method, labeled RCRnorm, incorporates information from all aspects of the experimental design and simultaneously removes biases from various sources. It eliminates the unrealistic assumption on housekeeping genes and offers great interpretability. Furthermore, it is applicable to fresh frozen or similar samples, which can generally be viewed as a reduced case of FFPE samples. Simulations and applications showed the superior performance of RCRnorm.
MSC:
62P10 Applications of statistics to biology and medical sciences; meta analysis
92D10 Genetics and epigenetics
62G05 Nonparametric estimation
62J02 General nonlinear regression
References:
[1] Abdueva, D., Wing, M., Schaub, B., Triche, T. and Davicioni, E. (2010). Quantitative expression profiling in formalin-fixed paraffin-embedded samples by Affymetrix microarrays. J. Mol. Diagnostics 12 409-417.
[2] April, C., Klotzle, B., Royce, T., Wickham-Garcia, E., Boyaniwsky, T., Izzo, J., Cox, D., Jones, W., Rubio, R. et al. (2009). Whole-genome gene expression profiling of formalin-fixed, paraffin-embedded tissue samples. PLoS ONE 4 e8162.
[3] Chen, H.-Y., Yu, S.-L., Chen, C.-H., Chang, G.-C., Chen, C.-Y., Yuan, A., Cheng, C.-L., Wang, C.-H., Terng, H.-J. et al. (2007). A five-gene signature and clinical outcome in non-small-cell lung cancer. N. Engl. J. Med. 356 11-20.
[4] Efron, B. and Stein, C. (1981). The jackknife estimate of variance. Ann. Statist. 9 586-596. · Zbl 0481.62035
[5] Eisenberg, E. and Levanon, E. Y. (2013). Human housekeeping genes, revisited. Trends Genet. 29 569-574.
[6] Friedman, M. M. (1997). Clinical laboratory improvement amendment. Home Healthc. Now 15 393-395.
[7] Geiss, G. K., Bumgarner, R. E., Birditt, B., Dahl, T., Dowidar, N., Dunaway, D. L., Fell, H. P., Ferree, S., George, R. D. et al. (2008). Direct multiplexed measurement of gene expression with color-coded probe pairs. Nat. Biotechnol. 26 317-325.
[8] Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A. and Rubin, D. B. (2014). Bayesian Data Analysis, 3rd ed. Texts in Statistical Science Series. CRC Press, Boca Raton, FL. · Zbl 1279.62004
[9] Gubern, C., Hurtado, O., Rodríguez, R., Morales, J. R., Romera, V. G., Moro, M. A., Lizasoain, I., Serena, J. and Mallolas, J. (2009). Validation of housekeeping genes for quantitative real-time PCR in in-vivo and in-vitro models of cerebral ischaemia. BMC Mol. Biol. 10 57.
[10] Iddawela, M., Rueda, O. M., Klarqvist, M., Graf, S., Earl, H. M. and Caldas, C. (2016). Reliable gene expression profiling of formalin-fixed paraffin-embedded breast cancer tissue (FFPE) using cDNA-mediated annealing, extension, selection, and ligation whole-genome (DASL WG) assay. BMC Med. Genomics 9 54.
[11] Irizarry, R. A., Hobbs, B., Collin, F., Beazer-Barclay, Y. D., Antonellis, K. J., Scherf, U. and Speed, T. P. (2003). Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4 249-264. · Zbl 1141.62348
[12] Jia, G., Wang, X., Li, Q., Lu, W., Tang, X., Wistuba, I. and Xie, Y. (2019). Supplement to “RCRnorm: An integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data.” DOI:10.1214/19-AOAS1249SUPP.
[13] Kulkarni, M. M. (2011). Digital multiplexed gene expression analysis using the NanoString nCounter system. Curr. Protoc. Mol. Biol. 94 25B.10.1-25B.10.17.
[14] Law, C. W., Chen, Y., Shi, W. and Smyth, G. K. (2014). voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 15 R29.
[15] Li, C. and Wong, W. H. (2001). Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection. Proc. Natl. Acad. Sci. USA 98 31-36. · Zbl 0990.62091
[16] Lüder Ripoli, F., Mohr, A., Conradine Hammer, S., Willenbrock, S., Hewicker-Trautwein, M., Hennecke, S., Murua Escobar, H. and Nolte, I. (2016). A comparison of fresh frozen vs. formalin-fixed, paraffin-embedded specimens of canine mammary tumors via branched-DNA assay. Int. J. Mol. Sci. 17 724.
[17] Ludwig, J. A. and Weinstein, J. N. (2005). Biomarkers in cancer staging, prognosis and treatment selection. Nat. Rev. Cancer 5 845-856.
[18] Masuda, N., Ohnishi, T., Kawamoto, S., Monden, M. and Okubo, K. (1999). Analysis of chemical modification of RNA from formalin-fixed samples and optimization of molecular biology applications for such samples. Nucleic Acids Res. 27 4436-4443.
[19] NanoString Technologies, Inc. (2017). Gene Expression Data Analysis Guidelines—NanoString. MAN-C0011-04.
[20] Omolo, B., Yang, M., Lo, F. Y., Schell, M. J., Austin, S., Howard, K., Madan, A. and Yeatman, T. J. (2016). Adaptation of a RAS pathway activation signature from FF to FFPE tissues in colorectal cancer. BMC Med. Genomics 9 65.
[21] Paluch, B. E., Glenn, S. T., Conroy, J. M., Papanicolau-Sengos, A., Bshara, W., Omilian, A. R., Brese, E., Nesline, M., Burgher, B. et al. (2017). Robust detection of immune transcripts in FFPE samples using targeted RNA sequencing. Oncotarget 8 3197-3205.
[22] Perlmutter, M. A., Best, C. J., Gillespie, J. W., Gathright, Y., González, S., Velasco, A., Linehan, W. M., Emmert-Buck, M. R. and Chuaqui, R. F. (2004). Comparison of snap freezing versus ethanol fixation for gene expression profiling of tissue specimens. J. Mol. Diagnostics 6 371-377.
[23] Reis, P. P., Waldron, L., Goswami, R. S., Xu, W., Xuan, Y., Perez-Ordonez, B., Gullane, P., Irish, J., Jurisica, I. et al. (2011). mRNA transcript quantification in archival samples using multiplexed, color-coded probes. BMC Biotechnol. 11 46.
[24] Risso, D., Ngai, J., Speed, T. P. and Dudoit, S. (2014). Normalization of RNA-seq data using factor analysis of control genes or samples. Nat. Biotechnol. 32 896.
[25] Rosenfeld, N., Aharonov, R., Meiri, E., Rosenwald, S., Spector, Y., Zepeniuk, M., Benjamin, H., Shabes, N., Tabak, S. et al. (2008). MicroRNAs accurately identify cancer tissue origin. Nat. Biotechnol. 26 462-469.
[26] Shapiro, S. S. and Wilk, M. B. (1965). An analysis of variance test for normality: Complete samples. Biometrika 52 591-611. · Zbl 0134.36501
[27] Solassol, J., Ramos, J., Crapez, E., Saifi, M., Mangé, A., Vianès, E., Lamy, P.-J., Costes, V. and Maudelonde, T. (2011). KRAS mutation detection in paired frozen and formalin-fixed paraffin-embedded (FFPE) colorectal cancer tissues. Int. J. Mol. Sci. 12 3191-3204.
[28] Stefan, U., Michael, B. and Werner, S. (2010). Effects of three different preservation methods on the mechanical properties of human and bovine cortical bone. Bone 47 1048-1053.
[29] Tang, H., Xiao, G., Behrens, C., Schiller, J., Allen, J., Chow, C.-W., Suraokar, M., Corvalan, A., Mao, J. et al. (2013). A 12-gene set predicts survival benefits from adjuvant chemotherapy in non-small cell lung cancer patients. Clin. Cancer Res. 19 1577-1586.
[30] Thompson, D., Botros, I., Rounseville, M., Liu, Q., Wang, E., Harrison, H. and Roche, P. (2014). Automated high fidelity RNA expression profiling using nuclease protection coupled with next generation sequencing. In Merck Technology Symposium, Long Branch, NJ 6.
[31] Vallejos, C. A., Risso, D., Scialdone, A., Dudoit, S. and Marioni, J. C. (2017). Normalizing single-cell RNA sequencing data: Challenges and opportunities. Nat. Methods 14 565-571.
[32] von Ahlfen, S., Missel, A., Bendrat, K. and Schlumpberger, M. (2007). Determinants of RNA quality from FFPE samples. PLoS ONE 2 e1261.
[33] Waggott, D., Chu, K., Yin, S., Wouters, B. G., Liu, F.-F. and Boutros, P. C. (2012). NanoStringNorm: An extensible R package for the pre-processing of NanoString mRNA and miRNA data. Bioinformatics 28 1546-1548.
[34] Wang, H., Horbinski, C., Wu, H., Liu, Y., Sheng, S., Liu, J., Weiss, H., Stromberg, A. J. and Wang, C. (2016). NanoStringDiff: A novel statistical method for differential expression analysis based on NanoString nCounter data. Nucleic Acids Res. 44 e151-e151.
[35] Xie, Y., Xiao, G., Coombes, K. R., Behrens, C., Solis, L. M., Raso, G., Girard, L., Erickson, H. S., Roth, J. et al. (2011). Robust gene expression signature from formalin-fixed paraffin-embedded samples predicts prognosis of non-small-cell lung cancer patients. Clin. Cancer Res. 17 5705-5714.
[36] Xie, Y., Lu, W., Wang, S., Tang, X., Tang, H., Zhou, Y., Moran, C., Behrens, C., Roth, J. et al. (2017). Validation of the 12-gene predictive signature for adjuvant chemotherapy response in lung cancer. J. Thorac. Oncol. 12 S1544.
[37] Ziober, A. F., Patel, K. R., Alawi, F., Gimotty, P., Weber, R. S., Feldman, M. M., Chalian, A. A., Weinstein, G. S., Hunt, J. et al. (2006). Identification of a gene signature for rapid screening of oral squamous cell carcinoma. Clin. Cancer Res. 12 5960-5971.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.