The curse of dimensionality – a challenge for mathematical statistics. (English) Zbl 1002.62048

From the introduction: The oral presentation of this paper was given as a plenary lecture at the Annual Conference of the DMV (German National Mathematical Society). It was meant to give a partial overview of the problem of analyzing high-dimensional data and was addressed to an audience with a general mathematical background, not necessarily familiar with special methods in mathematical statistics and data analysis. We therefore chose to start at a rather elementary level and to show principal problems as well as some solutions by means of prototypical examples such as regression modeling with high-dimensional influential variables and outlier identification in high-dimensional data from simple distributions.
The structure of this paper is as follows. In Section 2, we repeat some well-known facts about multidimensional data and explain what we mean by the curse of dimensionality. In Section 3, we consider some very simple statistical models in order to motivate the need for new methods in high dimensions. Finally, two special statistical tasks are discussed in more detail in the situation of high-dimensional data. These are the identification of outliers and dimension reduction in nonparametric regression. We finish with some concluding remarks.


62H99 Multivariate analysis
62G08 Nonparametric regression and quantile regression
62-07 Data analysis (statistics) (MSC2010)