## A deconvolution path for mixtures. (English) Zbl 1404.62033

In this paper the authors propose a nonparametric method for deconvolution that is both statistically and computationally efficient. The method is motivated by an underlying Bayesian model that places a prior $$f_0$$ on the latent means: $y_i|\mu_i\sim\phi(y_i|\mu_i),\quad \mu_i\sim f_0,\;\;\mu_i\;\;\text{i.i.d.}$
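To make the hierarchical model concrete, the following minimal sketch draws data from it. The Gaussian choice for $$\phi$$ with unit variance, the two-component form of $$f_0$$, and the function name `simulate_mixture` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def simulate_mixture(n, seed=0):
    """Draw (mu, y) from the model mu_i ~ f_0, y_i | mu_i ~ phi(y_i | mu_i).

    Assumptions (illustrative only): f_0 is an equal-weight mixture of
    N(-2, 0.5^2) and N(2, 0.5^2), and phi(y | mu) is the N(mu, 1) density.
    """
    rng = np.random.default_rng(seed)
    # Latent means mu_i, drawn i.i.d. from the assumed mixing distribution f_0.
    component = rng.integers(0, 2, size=n)
    centers = np.where(component == 0, -2.0, 2.0)
    mu = centers + 0.5 * rng.standard_normal(n)
    # Observations y_i: latent mean plus assumed standard normal noise.
    y = mu + rng.standard_normal(n)
    return mu, y


mu, y = simulate_mixture(1000)
```

Only `y` would be available in practice. The deconvolution problem is to recover $$f_0$$ from `y` alone, which is harder than ordinary density estimation because the noise makes `y` more spread out than `mu`.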
Instead of using a full Bayes analysis, the authors use a two-step “bin and smooth” procedure. The “bin” step forms a histogram of the sample, yielding the number of observations $$x_j$$ that fall into the $$j$$-th histogram bin. In the “smooth” step these counts are used to compute a maximum a posteriori (MAP) estimate of $$f_0$$ under a prior that encourages smoothness.
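The “bin” step can be sketched as follows. This is a minimal illustration that assumes equal-width bins. The function name `bin_step`, the default of 50 bins, and the placeholder sample are hypothetical choices, not the authors' settings. The “smooth” step is not sketched, since it depends on the specific prior and optimizer used in the paper.

```python
import numpy as np


def bin_step(y, n_bins=50):
    """Return the bin midpoints and the counts x_j for the sample y.

    Assumption: equal-width bins spanning the range of the data.
    """
    counts, edges = np.histogram(y, bins=n_bins)
    # Midpoint of each bin, useful as the grid on which to smooth later.
    midpoints = 0.5 * (edges[:-1] + edges[1:])
    return midpoints, counts


rng = np.random.default_rng(1)
y = rng.standard_normal(500)  # placeholder sample, for illustration only
midpoints, x = bin_step(y)
```

Here `x[j]` plays the role of $$x_j$$. After this step the cost of the “smooth” step depends on the number of bins rather than on the sample size, which is consistent with the reduced computational cost described in the review.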
It is shown that the proposed nonparametric empirical-Bayes procedure yields excellent performance for deconvolution, at reduced computational cost compared to full nonparametric Bayesian methods. The main theorem establishes conditions under which the method yields a consistent estimate of the mixing distribution $$f_0$$. Simulation evidence that the method offers practical improvements over existing state-of-the-art methods is also provided.

### MSC:

- 62G07 Density estimation
- 62H30 Classification and discrimination; cluster analysis (statistical aspects)
- 62G20 Asymptotic properties of nonparametric inference

### Software:

mixfdr; nlpden; smoothfdr; REBayes
