Delaigle, Aurore; Hall, Peter
Effect of heavy tails on ultra high dimensional variable ranking methods. (English) Zbl 1257.62057
Stat. Sin. 22, No. 3, 909-932 (2012).

Summary: Contemporary problems involving sparse, high-dimensional feature selection are becoming rapidly more challenging through substantial increases in dimension. This places ever more stress on methods for analysis, since the effects of even moderately heavy-tailed feature distributions become more significant as the number of features diverges. Data transformations have a significant role to play, reducing noise and enabling an increase in dimension, and for this reason they are increasingly used. We examine the performance of a typical transformation of this type, and study the extent to which it preserves the main attributes that lead to reliable feature selection. We show both numerically and theoretically that, in the presence of heavy-tailed data, the size of the dimension for which effective variable selection is possible can be increased dramatically, from a low-degree polynomial function of sample size to one that is exponentially large.

Cited in 5 Documents

MSC:
62G32 Statistics of extreme values; tail inference
62P10 Applications of statistics to biology and medical sciences; meta analysis
92C40 Biochemistry, molecular biology
92D10 Genetics and epigenetics
65C60 Computational problems in statistics (MSC2010)

Keywords: correlations; feature selection; heavy tails; nonparametric statistics; Studentising; variable selection

Software: corpor

\textit{A. Delaigle} and \textit{P. Hall}, Stat. Sin. 22, No. 3, 909--932 (2012; Zbl 1257.62057)