Deletion/substitution/addition algorithm in learning with applications in genomics. (English) Zbl 1166.62368

Summary: Van der Laan and Dudoit provided a road map for estimation and performance assessment in which the parameter of interest is defined as the risk minimizer for a suitable loss function and candidate estimators are generated using a loss function. After briefly reviewing this approach, this article proposes a general deletion/substitution/addition algorithm for minimizing, over subsets of variables (e.g., basis functions), the empirical risk of subset-specific estimators of the parameter of interest. This algorithm yields a new class of loss-based cross-validated algorithms for the prediction of univariate outcomes, which can be extended to handle multivariate outcomes, conditional density and hazard estimation, and censored outcomes such as survival times. In the context of regression, using polynomial basis functions, we study the properties of the deletion/substitution/addition algorithm in simulations and apply the method to detect transcription factor binding sites in yeast gene expression experiments.
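The subset search described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a squared-error loss, ordinary least squares as the subset-specific estimator, and a fixed matrix X whose columns stand in for the candidate basis functions. The names empirical_risk, dsa_search, max_size and tol are hypothetical. The cross-validation step that the paper uses to select among subset sizes is omitted.

```python
import numpy as np

def empirical_risk(X, y, subset):
    """Mean squared error of a least-squares fit on the chosen columns plus an intercept."""
    cols = [np.ones(len(y))] + [X[:, j] for j in sorted(subset)]
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((y - A @ coef) ** 2))

def best_move(X, y, moves):
    """Return (risk, subset) for the lowest-risk candidate among the given moves."""
    return min(((empirical_risk(X, y, s), s) for s in moves), key=lambda t: t[0])

def dsa_search(X, y, max_size, tol=1e-12):
    """Simplified deletion/substitution/addition search (illustrative assumption).

    Returns a dict mapping subset size -> (lowest empirical risk found, subset).
    """
    p = X.shape[1]
    current = frozenset()
    best = {0: (empirical_risk(X, y, current), current)}
    while True:
        k = len(current)
        outside = [j for j in range(p) if j not in current]

        # Deletion: accept only if it beats the best risk seen so far at size k - 1.
        if k > 0:
            risk, cand = best_move(X, y, [current - {j} for j in current])
            if risk < best[k - 1][0] - tol:
                best[k - 1] = (risk, cand)
                current = cand
                continue

        # Substitution: accept only if it beats the best risk seen so far at size k.
        if k > 0 and outside:
            swaps = [(current - {j}) | {i} for j in current for i in outside]
            risk, cand = best_move(X, y, swaps)
            if risk < best[k][0] - tol:
                best[k] = (risk, cand)
                current = cand
                continue

        # Addition: always taken while the size limit allows it.
        if k >= max_size or not outside:
            break
        risk, cand = best_move(X, y, [current | {i} for i in outside])
        if k + 1 not in best or risk < best[k + 1][0]:
            best[k + 1] = (risk, cand)
        current = cand
    return best
```

Calling dsa_search(X, y, max_size=3) on an n-by-p array X and a length-n response y returns the lowest-risk subset found for each size from 0 to 3. In a fuller treatment the size would then be chosen by cross-validated risk rather than empirical risk, since empirical risk tends to favor larger subsets.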


62P10 Applications of statistics to biology and medical sciences; meta analysis
65C60 Computational problems in statistics (MSC2010)
92D10 Genetics and epigenetics