CoRF
swMATH ID: 42227
Software Authors: Dennis E. te Beest, Steven W. Mes, Saskia M. Wilting, Ruud H. Brakenhoff, Mark A. van de Wiel
Description: R package CoRF: Co-data moderated randomForest. Paper: Improved high-dimensional prediction with Random Forests by the use of co-data. Results: Co-data are incorporated in the Random Forest by replacing the uniform sampling probabilities used to draw candidate variables with co-data moderated sampling probabilities. Co-data are defined here as any type of information that is available on the variables of the primary data but does not use its response labels. Inspired by empirical Bayes, these moderated sampling probabilities are learned from the data at hand. We demonstrate the co-data moderated Random Forest (CoRF) with two examples. In the first example we aim to predict the presence of a lymph node metastasis from gene expression data. We demonstrate how a set of external p-values, a gene signature, and the correlation between gene expression and DNA copy number can improve predictive performance. In the second example we demonstrate how the prediction of cervical (pre-)cancer from methylation data can be improved by including the location of the probe relative to the known CpG islands, the number of CpG sites targeted by a probe, and a set of p-values from a related study.
Homepage: https://link.springer.com/article/10.1186/s12859-017-1993-1
Source Code: https://github.com/DennisBeest/CoRF
Dependencies: R
Related Software: gren; glmnet; scam; mgcv; fwelnet; graper; GRridge; gglasso; grplasso; ggplot2; ggpubr; squeezy; R; ecpc; EMVS; ipflasso; EBayesThresh; BayesLogit; glasso
Cited in: 1 Publication
Cited by 3 Authors: De Wiel, Mark A. van (1); Münch, Magnus M. (1); Te Beest, Dennis E. (1)
Cited in 1 Serial: Scandinavian Journal of Statistics (1)
Cited in 1 Field: Statistics (62-XX) (1)
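
The core idea in the description, drawing candidate variables with co-data moderated probabilities instead of uniform ones, can be illustrated with a minimal sketch. This is not code from the CoRF package (which is written in R) and it does not show how CoRF learns the probabilities from the data. The helper names `normalize` and `draw_candidates` and the weight values are hypothetical, chosen only for illustration, and Python is used purely as a neutral sketch language.

```python
import random


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]


def draw_candidates(probs, mtry, rng):
    """Draw up to `mtry` distinct variable indices without replacement.

    At each step a variable is picked with chance proportional to its
    probability among the variables not yet chosen. Variables with
    probability zero are never drawn.
    """
    remaining = {j: p for j, p in enumerate(probs) if p > 0}
    chosen = []
    for _ in range(min(mtry, len(remaining))):
        indices = list(remaining)
        weights = [remaining[j] for j in indices]
        pick = rng.choices(indices, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen


rng = random.Random(1)
n_vars = 6

# Standard random forest: every variable is equally likely to be a candidate.
uniform = normalize([1.0] * n_vars)

# Hypothetical moderated weights (assumed values, not learned here):
# variables 0 and 1 are favoured, variables 4 and 5 are excluded.
moderated = normalize([4.0, 4.0, 1.0, 1.0, 0.0, 0.0])

uniform_draw = draw_candidates(uniform, 3, rng)
moderated_draw = draw_candidates(moderated, 3, rng)
print("uniform candidates:  ", uniform_draw)
print("moderated candidates:", moderated_draw)
```

In a forest, a draw like this would be repeated whenever candidate variables are needed for a split, so variables that the co-data mark as promising are offered to the trees more often.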