# zbMATH — the first resource for mathematics

A recursive procedure for density estimation on the binary hypercube. (English) Zbl 1337.62070
Summary: This paper describes a recursive estimation procedure for multivariate binary densities (probability distributions of vectors of Bernoulli random variables) using orthogonal expansions. For $$d$$ covariates, there are $$2^{d}$$ basis coefficients to estimate, which renders conventional approaches computationally prohibitive when $$d$$ is large. However, for a wide class of densities that satisfy a certain sparsity condition, our estimator runs in probabilistic polynomial time and adapts to the unknown sparsity of the underlying density in two key ways: (1) it attains near-minimax mean-squared error for moderate sample sizes, and (2) the computational complexity is lower for sparser densities. Our method also allows for flexible control of the trade-off between mean-squared error and computational complexity.

##### MSC:
 62G07 Density estimation 62G20 Asymptotic properties of nonparametric inference 62C20 Minimax procedures in statistical decision theory
Full Text:
##### References:
 [1] J. Aitchison and C. G. G. Aitken. Multivariate binary discrimination by the kernel method. Biometrika , 63(3):413-420, 1976. · Zbl 0344.62035 [2] R. R. Bahadur. A representation of the joint distribution of $$n$$ dichotomous items. In H. Solomon, editor, Studies in Item Analysis and Prediction , pages 169-176. Stanford Univ. Press, 1961. · Zbl 0103.36701 [3] J. Bergh and J. Löfström. Interpolation Spaces: An Introduction . Springer-Verlag, 1976. [4] E. Candès. Modern statistical estimation via oracle inequalities. Acta Numerica , 15:257-325, 2006. · Zbl 1141.62001 [5] E. J. Candès and T. Tao. Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inform. Theory , 52(12):5406-5425, December 2006. · Zbl 1309.94033 [6] J. M. Carro. Estimating dynamic panel data discrete choice models with fixed effects. J. Econometrics , 140:503-528, 2007. · Zbl 1247.91130 [7] X. R. Chen, P. R. Krishnaiah, and W. W. Liang. Estimation of multivariate binary density using orthogonal functions. J. Multivariate Anal. , 31:178-186, 1989. · Zbl 0687.62040 [8] T. M. Cover and J. A. Thomas. Elements of Information Theory . Wiley, New York, 2nd edition, 2006. · Zbl 1140.94001 [9] I. Dinur, E. Friedgut, G. Kindler, and R. O’Donnell. On the Fourier tails of bounded functions over the discrete cube. Israel J. Math. , 160(389-412), 2007. · Zbl 1268.43003 [10] D. L. Donoho, I. M. Johnstone, G. Kerkyacharian, and D. Picard. Density estimation by wavelet thresholding. Ann. Statist. , 24(2):508-539, 1996. · Zbl 0860.62032 [11] S. Efromovich. Nonparametric Curve Estimation . Springer, 1999. · Zbl 0935.62039 [12] M. J. García-Zattera, A. Jara, E. Lesaffre, and D. Declerck. Conditional independence of multivariate binary data with an application in caries research. Computational Statistics and Data Analysis , 51:3223-3234, 2007. · Zbl 1161.62426 [13] Z. Ghahramani and K. Heller. Bayesian sets. In Y. Weiss, B. Schölkopf, and J. Platt, editors, Advances in Neural Information Processing Systems 18 , pages 435-442. MIT Press, Cambridge, MA, 2006. [14] A. C. Gilbert, S. Guha, P. Indyk, S. Muthukrishnan, and M. Strauss. Near-optimal sparse Fourier representations via sampling. In Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing , 2002. · Zbl 1192.94078 [15] A. C. Gilbert, S. Muthukrishnan, and M. J. Strauss. Improved time bounds for near-optimal sparse Fourier representation via sampling. In Proc. SPIE Wavelets XI , San Diego, CA, 2005. [16] A. C. Gilbert and M. J. Strauss. Group testing in statistical signal recovery. Preprint, 2006. [17] O. Goldreich and L. Levin. A hard-core predicate for all one-way functions. In Proc. 21st ACM Symp. on Theory of Computing , pages 25-32, 1989. [18] M. Gyllenberg and T. Koski. Probabilistic models for bacterial taxonomy. International Statistical Review , 69(2):249-276, August 2001. · Zbl 1180.92004 [19] P. Hall, G. Kerkyacharian, and D. Picard. Block threshold rules for curve estimation using kernel and wavelet methods. Ann. Statist. , 26(3):922-942, 1998. · Zbl 0929.62040 [20] P. Hall, S. Penev, G. Kerkyacharian, and D. Picard. Numerical performance of block thresholded wavelet estimators. Statistics and Computing , 7:115-124, 1997. [21] I. M. Johnstone. Minimax Bayes, asymptotic minimax and sparse wavelet priors. In S. S. Gupta and J. O. Berger, editors, Statistical Decision Theory and Related Topics V , pages 303-326. Springer, 1994. · Zbl 0815.62017 [22] E. Kushilevitz and Y. Mansour. Learning decision trees using the Fourier spectrum. SIAM J. Comput. , 22(6):1331-1348, 1993. · Zbl 0799.68159 [23] S. L. Lauritzen. Graphical Models . Clarendon Press, Oxford, 1996. · Zbl 0907.62001 [24] W.-Q. Liang and P. R. Krishnaiah. Nonparametric iterative estimation of multivariate binary density. J. Multivariate Anal. , 16:162-172, 1985. · Zbl 0558.62034 [25] Y. Mansour. Learning Boolean functions via the Fourier transform. In V. P. Roychodhury, K.-Y. Siu, and A. Orlitsky, editors, Theoretical Advances in Neural Computation and Learning , pages 391-424. Kluwer, 1994. · Zbl 0845.68093 [26] P. Massart. Concentration Inequalities and Model Selection . Springer, 2007. · Zbl 1170.60006 [27] S. Mendelson. A few notes on statistical learning theory. In S. Mendelson and A. J. Smola, editors, Advanced Lectures in Machine Learning , volume 2600 of Lecture Notes in Computer Science . Springer, 2003. · Zbl 1019.68093 [28] J. Ott and R. A. Kronmal. Some classification procedures for multivariate binary data using orthogonal functions. J. Amer. Stat. Assoc. , 71(354):391-399, June 1976. · Zbl 0336.62044 [29] P. Reynaud-Bouret. Adaptive estimation of the intensity of inhomogeneous Poisson processes via concentration inequalities. Probab. Th. Rel. Fields , 126:103-153, 2003. · Zbl 1019.62079 [30] H. P. Rosenthal. On the span in $$l_{p}$$ of sequences of independent random variables. Israel J. Math. , 8:273-303, 1972. [31] I. Shmulevich and W. Zhang. Binary analysis and optimization-based normalization of gene expression data. Bioinformatics , 18(4):555-565, 2002. [32] J. Silva and R. Willett. Hypergraph-based detection of anomalous high-dimensional co-occurrences. IEEE Trans. Pattern Anal. Mach. Intel. , 31(3):563-569, 2009. [33] J. S. Simonoff. Smoothing categorical data. J. Statist. Planning and Inference , 47:41-60, 1995. · Zbl 0832.62053 [34] M. Talagrand. On Russo’s approximate zero-one law. Ann. Probab. , 22(3):1576-1587, 1994. · Zbl 0819.28002 [35] M. Talagrand. Sharper bounds for Gaussian and empirical processes. Ann. Probab. , 22:28-76, 1994. · Zbl 0798.60051 [36] T. Tao and V. H. Vu. Additive Combinatorics . Cambridge Univ. Press, 2006. · Zbl 1127.11002 [37] A. W. van der Vaart and J. A. Wellner. Weak Convergence and Empirical Processes . Springer, 1996. · Zbl 0862.60002 [38] J. D. Wilbur, J. K. Ghosh, C. H. Nakatsu, S. M. Brouder, and R. W. Doerge. Variable selection in high-dimensional multivariate binary data with applications to the analysis of microbial community DNA fingerprints. Biometrics , 58:378-386, June 2002. · Zbl 1209.62367 [39] Y. Yang and A. Barron. Information-theoretic determination of minimax rates of convergence. Technical Report 28, Department of Statistics, Iowa State University, 1997. · Zbl 0978.62008 [40] Y. Yang and A. Barron. Information-theoretic determination of minimax rates of convergence. Ann. Statist. , 27(5):1564-1599, 1999. · Zbl 0978.62008
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.