Quantitative asymptotics of graphical projection pursuit. (English) Zbl 1189.60046

Summary: A result of P. Diaconis and D. Freedman [Ann. Stat. 12, 793–815 (1984; Zbl 0559.62002)] says that, in a limiting sense, for large collections of high-dimensional data, most one-dimensional projections of the data are approximately Gaussian. This paper gives quantitative versions of that result. For a set of deterministic vectors \(\{x_i\}^n_{i=1}\) in \(\mathbb R^d\) with \(n\) and \(d\) fixed, let \(\theta\in\mathbb S^{d-1}\) be a random point of the sphere and let \(\mu^\theta_n\) denote the random measure which puts mass \(\frac1n\) at each of the points \(\langle x_1,\theta\rangle,\dots,\langle x_n,\theta\rangle\). For a fixed bounded Lipschitz test function \(f\), \(Z\) a standard Gaussian random variable and \(\sigma^2\) a suitable constant, an explicit bound is derived for the quantity
\[ \mathbb P\left[\left|\int f d\mu^\theta_n-\mathbb E f(\sigma Z)\right|>\varepsilon\right]. \]
A bound is also given for \(\mathbb P[d_{BL}(\mu^\theta_n,{\mathcal N}(0,\sigma^2))>\varepsilon]\), where \(d_{BL}\) denotes the bounded-Lipschitz distance. This yields a lower bound on the waiting time until a non-Gaussian projection of the \(\{x_i\}\) is found, if directions are tried independently and uniformly on \(\mathbb S^{d-1}\).
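
The quantity bounded in the summary can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's method: it draws directions \(\theta\) uniformly on the sphere, forms \(\int f\,d\mu^\theta_n\), and compares it with \(\mathbb E f(\sigma Z)\). Several choices here are assumptions not fixed by the summary: the constant is taken to be \(\sigma^2=\frac{1}{nd}\sum_i|x_i|^2\), the data are random sign vectors, the test function is \(f=\cos\), and the function names are hypothetical.

```python
import numpy as np

def random_directions(d, m, rng):
    """Draw m points uniformly from the unit sphere S^{d-1} by normalizing Gaussians."""
    g = rng.standard_normal((m, d))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def gaussian_expectation(f, sigma, deg=64):
    """Approximate E f(sigma * Z), Z ~ N(0, 1), by Gauss-Hermite quadrature."""
    # hermegauss uses the weight exp(-x^2 / 2), whose total mass is sqrt(2 * pi).
    nodes, weights = np.polynomial.hermite_e.hermegauss(deg)
    return np.sum(weights * f(sigma * nodes)) / np.sqrt(2.0 * np.pi)

def projection_gaps(x, f, m, rng):
    """Return |int f d(mu_n^theta) - E f(sigma Z)| for m independent uniform directions."""
    n, d = x.shape
    # ASSUMPTION: sigma^2 = (1 / (n d)) * sum_i |x_i|^2; the summary only says "suitable".
    sigma = np.sqrt(np.sum(x ** 2) / (n * d))
    thetas = random_directions(d, m, rng)
    projections = x @ thetas.T               # shape (n, m): <x_i, theta_j>
    empirical = f(projections).mean(axis=0)  # integral of f against mu_n^theta
    return np.abs(empirical - gaussian_expectation(f, sigma))

rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=(500, 200))  # illustrative data: n = 500, d = 200
gaps = projection_gaps(x, np.cos, m=200, rng=rng)
eps = 0.1
print("fraction of directions with gap > eps:", np.mean(gaps > eps))
```

The printed fraction is an empirical estimate of the probability displayed in the summary for this particular data set and test function; it illustrates the quantity only and does not reproduce the explicit bound.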


60E15 Inequalities; stochastic orderings
62E20 Asymptotic distribution theory in statistics


Zbl 0559.62002