
On the use of reproducing kernel Hilbert spaces in functional classification. (English) Zbl 1402.68152

Summary: The Hájek-Feldman dichotomy establishes that two Gaussian measures are either mutually absolutely continuous (so that each measure has a Radon-Nikodym density with respect to the other) or mutually singular. Unlike the case of finite-dimensional Gaussian measures, there are nontrivial examples of both situations when dealing with Gaussian stochastic processes. This article provides:
(a) Explicit expressions for the optimal (Bayes) rule and the minimal classification error probability in several relevant problems of supervised binary classification of mutually absolutely continuous Gaussian processes. The approach relies on some classical results in the theory of reproducing kernel Hilbert spaces (RKHS).
(b) An interpretation, in terms of mutual singularity, of the so-called “near perfect classification” phenomenon described by Delaigle and Hall [10]. We show that the asymptotically optimal rule proposed by these authors can be identified with the sequence of optimal rules for an approximating sequence of classification problems in the absolutely continuous case.
(c) As an application, a natural variable selection method, which essentially consists of reducing the original functional data \(X(t)\), \(t\in [0, 1]\), to a \(d\)-dimensional marginal \((X(t_1), \dots, X(t_d))\), chosen to minimize the classification error of the corresponding Fisher’s linear rule. We give precise conditions under which this discrimination method achieves the minimal classification error of the original functional problem.
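The variable selection idea in (c) can be illustrated with a minimal population-level sketch. It is not the authors' implementation: it assumes equal class priors, a common known covariance function for both classes, and a greedy search over a finite grid (the summary does not say how the minimization is carried out). Under those assumptions, Fisher's linear rule on the marginal \((X(t_1), \dots, X(t_d))\) has error \(\Phi(-\Delta/2)\), where \(\Delta\) is the Mahalanobis distance between the two class means. The names `fisher_error`, `greedy_select`, `bm_cov` and `drift` are hypothetical.

```python
import numpy as np
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def fisher_error(mean_diff, cov):
    """Error of Fisher's linear rule for two Gaussian classes with a
    common covariance matrix and equal priors (assumed here)."""
    # Squared Mahalanobis distance between the class means.
    delta2 = float(mean_diff @ np.linalg.solve(cov, mean_diff))
    return normal_cdf(-0.5 * sqrt(delta2))


def greedy_select(grid, mean_diff_fn, cov_fn, d):
    """Greedily pick d grid points whose marginal gives the smallest
    Fisher error. The greedy search is an illustrative choice only."""
    chosen = []
    for _ in range(d):
        best_t, best_err = None, np.inf
        for t in grid:
            t = float(t)
            if t in chosen:
                continue
            pts = np.array(chosen + [t])
            m = mean_diff_fn(pts)
            K = cov_fn(pts[:, None], pts[None, :])
            err = fisher_error(m, K)
            if err < best_err:
                best_t, best_err = t, err
        chosen.append(best_t)
    pts = np.array(sorted(chosen))
    final_err = fisher_error(mean_diff_fn(pts), cov_fn(pts[:, None], pts[None, :]))
    return sorted(chosen), final_err


def bm_cov(s, t):
    """Covariance of standard Brownian motion: K(s, t) = min(s, t)."""
    return np.minimum(s, t)


def drift(t):
    """Hypothetical difference of class means, m(t) = t."""
    return np.asarray(t, dtype=float)


# Grid avoids t = 0, where Brownian motion is degenerate (K(0, 0) = 0).
grid = np.linspace(0.1, 1.0, 10)
selected, err = greedy_select(grid, drift, bm_cov, d=1)
print(selected, err)
```

In this toy case the mean difference equals \(K(\cdot, 1)\), so the single point \(t = 1\) already gives \(\Delta = 1\) and an error of \(\Phi(-1/2) \approx 0.3085\); adding further grid points does not lower it. With real data the means and covariance would have to be estimated, which this sketch does not attempt.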

MSC:

68T05 Learning and adaptive systems in artificial intelligence
46E22 Hilbert spaces with reproducing kernels (= (proper) functional Hilbert spaces, including de Branges-Rovnyak and other structured spaces)
62G08 Nonparametric regression and quantile regression
62H30 Classification and discrimination; cluster analysis (statistical aspects)

References:

[1] Aneiros, G.; Vieu, P., Variable selection in infinite-dimensional problems, Statistics & Probability Letters, 94, 12-20, (2014) · Zbl 1320.62163
[2] Baíllo, A.; Cuevas, A.; Cuesta-Albertos, J. A., Supervised classification for a family of Gaussian functional models, Scandinavian Journal of Statistics, 38, 480-498, (2011) · Zbl 1246.62155
[3] Baíllo, A.; Cuevas, A.; Fraiman, R., Classification methods with functional data, in: Ferraty, F.; Romain, Y. (eds.), Oxford Handbook of Functional Data Analysis, 259-297, (2011), Oxford University Press, Oxford
[4] Berlinet, A.; Thomas-Agnan, C., Reproducing Kernel Hilbert Spaces in Probability and Statistics, (2011), Springer, New York
[5] Berrendero, J. R.; Cuevas, A.; Torrecilla, J. L., The mRMR variable selection method: A comparative study for functional data, Journal of Statistical Computation and Simulation, 86, 891-907, (2016)
[6] Berrendero, J. R.; Cuevas, A.; Torrecilla, J. L., Variable selection in functional data analysis: A maxima-hunting proposal, Statistica Sinica, 26, 619-638, (2016) · Zbl 1356.62079
[7] Bongiorno, E. G.; Goia, A., Classification methods for Hilbert data based on surrogate density, Computational Statistics and Data Analysis, 99, 204-222, (2016) · Zbl 1468.62030
[8] Cadre, B., Supervised classification of diffusion paths, Mathematical Methods of Statistics, 22, 213-235, (2013) · Zbl 1293.62069
[9] Cuevas, A., A partial overview of the theory of statistics with functional data, Journal of Statistical Planning and Inference, 147, 1-23, (2014) · Zbl 1278.62012
[10] Delaigle, A.; Hall, P., Achieving near perfect classification for functional data, Journal of the Royal Statistical Society, Series B, 74, 267-286, (2012)
[11] Delaigle, A.; Hall, P., Methodology and theory for partial least squares applied to functional data, Annals of Statistics, 40, 322-352, (2012) · Zbl 1246.62084
[12] Delaigle, A.; Hall, P.; Bathia, N., Componentwise classification and clustering of functional data, Biometrika, 99, 299-313, (2012) · Zbl 1244.62090
[13] Devroye, L.; Györfi, L.; Lugosi, G., A Probabilistic Theory of Pattern Recognition, (1996), Springer-Verlag, New York
[14] Feldman, J., Equivalence and perpendicularity of Gaussian processes, Pacific Journal of Mathematics, 8, 699-708, (1958) · Zbl 0084.13001
[15] Ferraty, F.; Vieu, P., Nonparametric Functional Data Analysis: Theory and Practice, (2006), Springer, New York · Zbl 1119.62046
[16] Hall, P., Principal component analysis for functional data: Methodology, theory, and discussion, in: Ferraty, F.; Romain, Y. (eds.), Oxford Handbook of Functional Data Analysis, 210-234, (2011), Oxford University Press, Oxford
[17] Horváth, L.; Kokoszka, P., Inference for Functional Data with Applications, (2012), Springer, New York · Zbl 1279.62017
[18] Izenman, A. J., Modern Multivariate Statistical Techniques, (2008), Springer, New York
[19] Lindquist, M. A.; McKeague, I. W., Logistic regression with Brownian-like predictors, Journal of the American Statistical Association, 104, 1575-1585, (2009) · Zbl 1205.62125
[20] Mörters, P.; Peres, Y., Brownian Motion, (2010), Cambridge University Press, Cambridge
[21] Mosler, K.; Mozharovskyi, P., Fast DD-classification of functional data, Statistical Papers, 58, 1-35, (2015)
[22] Encyclopedia of Mathematics, (2009)
[23] Parzen, E., An approach to time series analysis, Annals of Mathematical Statistics, 32, 951-989, (1961) · Zbl 0107.13801
[24] Parzen, E., Extraction and detection problems and reproducing kernel Hilbert space, Journal of the Society for Industrial and Applied Mathematics Series A Control, 1, 35-62, (1962) · Zbl 0199.21904
[25] Segall, A.; Kailath, T., Radon-Nikodym derivatives with respect to measures induced by discontinuous independent-increment processes, Annals of Probability, 3, 449-464, (1975) · Zbl 0312.60023
[26] Shepp, L. A., Radon-Nikodym derivatives of Gaussian measures, Annals of Mathematical Statistics, 37, 321-354, (1966) · Zbl 0142.13901
[27] Varberg, D. E., On equivalence of Gaussian measures, Pacific Journal of Mathematics, 11, 751-762, (1961) · Zbl 0211.48002
[28] Varberg, D. E., On Gaussian measures equivalent to Wiener measure, Transactions of the American Mathematical Society, 113, 262-273, (1964) · Zbl 0203.17506
[29] Wang, J.-L.; Chiou, J.-M.; Müller, H.-G., Review of functional data analysis, Annual Review of Statistics and Its Application, 3, 257-295, (2016)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.