zbMATH — the first resource for mathematics

Fast rates for empirical vector quantization. (English) Zbl 1349.62038
Summary: We consider the rate of convergence of the expected loss of empirically optimal vector quantizers. Earlier results show that, for any fixed probability distribution supported on a bounded set and satisfying some regularity conditions, the mean-squared expected distortion decreases at the rate \(\mathcal{O}(\log n/n)\), where \(n\) is the sample size. We prove that this rate is actually \(\mathcal{O}(1/n)\). Although these conditions are hard to check, we show that well-clustered distributions with continuous densities supported on a bounded set fall within the scope of this result.
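The empirically optimal quantizer studied here is, in practice, approximated by \(k\)-means (Lloyd's) iterations on an i.i.d. sample. The following Python sketch is illustrative only and is not part of the paper: it fits a 2-point codebook to a well-clustered mixture (the favorable regime described in the summary) and evaluates the empirical mean-squared distortion; all names (`lloyd`, `distortion`) are hypothetical helpers.

```python
import numpy as np

def lloyd(X, k, n_iter=50, seed=0):
    """Approximate the empirically optimal k-point quantizer via Lloyd's
    algorithm (a local minimizer of the empirical distortion)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each sample to its nearest codepoint
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # move each codepoint to the mean of its Voronoi cell
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def distortion(X, centers):
    """Empirical mean-squared distortion of quantizing X with the codebook."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()

# A well-clustered distribution: mixture of two tight Gaussians
# (effectively supported on a bounded set)
rng = np.random.default_rng(1)
means = np.array([[0.0, 0.0], [5.0, 5.0]])
X = np.vstack([m + 0.1 * rng.standard_normal((500, 2)) for m in means])

codebook = lloyd(X, k=2)
print(distortion(X, codebook))
```

Rerunning this with increasing sample sizes \(n\) and averaging over many draws is one way to observe the decay of the excess distortion that the paper quantifies.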

MSC:
62E17 Approximations to statistical distributions (nonasymptotic)
62H30 Classification and discrimination; cluster analysis (statistical aspects)