swMATH ID: 
21273

Software Authors: 
Benjamin R. Fitzpatrick, Kerrie Mengersen

Description: 
R package forestviews: A network flow approach to visualising the roles of covariates in random forests. We propose novel applications of parallel coordinates plots and Sankey diagrams to represent the hierarchies of interacting covariate effects in random forests. Each visualisation summarises the frequencies of all of the paths through all of the trees in a random forest. Existing visualisations of the roles of covariates in random forests include: ranked bar or dot charts depicting scalar metrics of the contributions of individual covariates to the predictive accuracy of the random forest; line graphs depicting various summaries of the effect of varying a particular covariate on the predictions from the random forest; heatmaps of metrics of the strengths of interactions between all pairs of covariates; and parallel coordinates plots for each response class depicting the distributions of the values of all covariates among the observations most representative of those predicted to belong to that class. Together these visualisations facilitate substantial insights into the roles of covariates in a random forest but do not communicate the frequencies of the hierarchies of covariate effects across the random forest or the orders in which covariates occur in these hierarchies. Our visualisations address these gaps. We demonstrate our visualisations using a random forest fitted to publicly available data and provide a software implementation in the form of an R package.

Homepage: 
https://github.com/brfitzpatrick/forestviews

Source Code: 
https://github.com/brfitzpatrick/forestviews

Dependencies: 
R

Related Software: 
R;
dplyr;
tidyr;
forcats;
purrr;
ggplot2;
plotly;
UCIml;
mlbench;
caret;
viridis;
igraph;
shiny;
magrittr;
networkD3

Cited in: 
0 Documents
