Noisy independent factor analysis model for density estimation and classification. (English) Zbl 1329.62273

Summary: We consider the problem of multivariate density estimation when the unknown density is assumed to follow a particular form of dimensionality reduction, a noisy independent factor analysis (IFA) model. In this model the data are generated by a number of latent independent components having unknown distributions and are observed in Gaussian noise. We do not assume that either the number of components or the matrix mixing the components is known. We show that densities of this form can be estimated at a fast rate. Using the mirror averaging aggregation algorithm, we construct a density estimator which achieves a nearly parametric rate \((\log^{1/4} n)/\sqrt{n}\), independent of the dimensionality of the data, as the sample size \(n\) tends to infinity. This estimator is adaptive to the number of components, their distributions and the mixing matrix. We then apply this density estimator to construct nonparametric plug-in classifiers and show that they achieve the best obtainable rate of the excess Bayes risk, to within a logarithmic factor independent of the dimension of the data. Applications of this classifier to simulated data sets and to real data from a remote sensing experiment show promising results.
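As a rough illustration of the data-generating mechanism described in the summary, the following sketch simulates observations of the form "mixing matrix times independent latent components, plus Gaussian noise". It is only a sketch under stated assumptions: the function name `sample_noisy_ifa`, the Laplace and uniform component distributions, the dimensions and the noise level are all hypothetical choices made here for illustration, and the paper's exact parametrization (for instance, any constraints on the mixing matrix) is not given in the summary.

```python
import numpy as np


def sample_noisy_ifa(n, mixing, noise_sd, rng):
    """Draw n observations X = S @ mixing.T + Gaussian noise.

    `mixing` is a (d, m) matrix mapping m latent components to d observed
    coordinates. All distributional choices below are illustrative
    assumptions, not taken from the paper.
    """
    d, m = mixing.shape
    # Independent, non-Gaussian latent components: alternate between
    # Laplace and uniform marginals (arbitrary illustrative choice).
    latent = np.column_stack([
        rng.laplace(size=n) if j % 2 == 0 else rng.uniform(-1.0, 1.0, size=n)
        for j in range(m)
    ])
    # Isotropic Gaussian observation noise in the d-dimensional data space.
    noise = noise_sd * rng.standard_normal((n, d))
    return latent @ mixing.T + noise, latent


rng = np.random.default_rng(0)
A = rng.standard_normal((10, 2))   # hypothetical 10-dimensional data, 2 components
X, S = sample_noisy_ifa(500, A, noise_sd=0.1, rng=rng)
```

Here `X` holds 500 ten-dimensional observations whose non-Gaussian structure lives in a two-dimensional subspace. This low-dimensional structure is the kind of dimensionality reduction the estimator is said to adapt to, without being told the number of components or the mixing matrix.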

MSC:

62H25 Factor analysis and principal components; correspondence analysis
62G07 Density estimation
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

KernSmooth
