Empirical Bayes selection of wavelet thresholds.

*(English)* Zbl 1078.62005

Summary: This paper explores a class of empirical Bayes methods for level-dependent threshold selection in wavelet shrinkage. The prior considered for each wavelet coefficient is a mixture of an atom of probability at zero and a heavy-tailed density. The mixing weight, or sparsity parameter, for each level of the transform is chosen by marginal maximum likelihood. If estimation is carried out using the posterior median, this is a random thresholding procedure; the estimation can also be carried out using other thresholding rules with the same threshold. Details of the calculations needed for implementing the procedure are included. In practice, the estimates are quick to compute and there is software available. Simulations on the standard model functions show excellent performance, and applications to data drawn from various fields of application are used to explore the practical performance of the approach.
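As a hedged sketch of the setup just described, the prior and the marginal maximum likelihood step for a single level can be written as below. The symbols \(Z_i\), \(\mu_i\), \(w\), \(\gamma\), \(\varphi\) and \(g\) are notation chosen for this sketch, and unit noise variance is assumed for simplicity; the paper itself should be consulted for the exact formulation and for the constraints placed on \(w\).

\[
Z_i \mid \mu_i \sim N(\mu_i, 1), \qquad \mu_i \sim (1-w)\,\delta_0 + w\,\gamma ,
\]

\[
\hat{w} = \arg\max_{w} \sum_{i=1}^{n} \log\bigl\{(1-w)\,\varphi(Z_i) + w\,g(Z_i)\bigr\}, \qquad g = \gamma * \varphi .
\]

Here \(\delta_0\) is the atom at zero, \(\gamma\) is the heavy-tailed density, \(\varphi\) is the standard normal density, and \(g\) is the marginal density of an observation whose mean is drawn from \(\gamma\). The estimated weight \(\hat{w}\) is then plugged back into the prior, and the posterior median of \(\mu_i\) given \(Z_i\) acts as a thresholding rule.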

By using a general result on the risk of the corresponding marginal maximum likelihood approach for a single sequence, overall bounds on the risk of the method are found subject to membership of the unknown function in one of a wide range of Besov classes, covering also the case of \(f\) of bounded variation. The rates obtained are optimal for any value of the parameter \(p\) in \((0,\infty]\), simultaneously for a wide range of loss functions, each dominating the \(L_q\) norm of the \(\sigma\)th derivative, with \(\sigma\geq 0\) and \(0<q\leq 2\).

Attention is paid to the distinction between sampling the unknown function within white noise and sampling at discrete points, and between placing constraints on the function itself and on the discrete wavelet transform of its sequence of values at the observation points. Results for all relevant combinations of these scenarios are obtained. In some cases a key feature of the theory is a particular boundary-corrected wavelet basis, details of which are discussed.

Overall, the approach described seems so far unique in combining fast computation, good theoretical properties, and good performance in simulations and in practice. A key feature appears to be that the estimate of sparsity adapts to three different zones of estimation: first, where the signal is not sparse enough for thresholding to be of benefit; second, where an appropriately chosen threshold results in substantially improved estimation; and third, where the signal is so sparse that the zero estimate gives the optimum accuracy rate.
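The way the sparsity estimate responds to these three zones can be illustrated with a small simulation. The following is a minimal sketch, not the authors' software: it assumes unit noise variance, a Laplace density with scale parameter 0.5 as the heavy-tailed component, and a fixed lower bound `w_min` on the weight (the paper ties its bound to the sample size instead). The function names `beta` and `mml_weight` and the three test signals are hypothetical choices made for this example.

```python
import math
import random

A = 0.5  # Laplace scale parameter of the heavy-tailed prior (assumed value)


def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    # erfc keeps accuracy in the lower tail
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def beta(x, a=A):
    """Return g(x)/phi(x) - 1, where g is the Laplace(a) density
    convolved with the standard normal density phi."""
    x = min(abs(x), 35.0)  # clamp so phi(x) does not underflow to zero
    g = 0.5 * a * math.exp(0.5 * a * a) * (
        math.exp(-a * x) * std_normal_cdf(x - a)
        + math.exp(a * x) * std_normal_cdf(-x - a)
    )
    return g / std_normal_pdf(x) - 1.0


def mml_weight(z, a=A, w_min=1e-3, iters=60):
    """Marginal maximum likelihood estimate of the mixing weight w.

    The log-likelihood sum(log(1 + w * beta_i)) is concave in w, so its
    derivative (the score) is decreasing and bisection finds the root.
    """
    b = [beta(zi, a) for zi in z]

    def score(w):
        return sum(bi / (1.0 + w * bi) for bi in b)

    if score(w_min) <= 0.0:  # maximum at the lower bound: very sparse
        return w_min
    if score(1.0) >= 0.0:  # maximum at the upper bound: dense signal
        return 1.0
    lo, hi = w_min, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(1)
    n = 1000
    signals = {
        "dense (all means nonzero)": [rng.gauss(0.0, 3.0) for _ in range(n)],
        "sparse (5% of means equal 5)": [5.0] * 50 + [0.0] * (n - 50),
        "null (all means zero)": [0.0] * n,
    }
    for name, means in signals.items():
        z = [m + rng.gauss(0.0, 1.0) for m in means]
        print(name, round(mml_weight(z), 3))
```

In runs of this kind the estimated weight should be largest for the dense signal, intermediate for the sparse signal, and at or near the lower bound for pure noise. A larger weight corresponds to a smaller threshold, which is the qualitative adaptation the summary describes; the sketch says nothing about the risk bounds themselves.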

##### MSC:

| Code | Classification |
| --- | --- |
| 62C12 | Empirical decision procedures; empirical Bayes procedures |
| 62G08 | Nonparametric regression and quantile regression |
| 65T60 | Numerical methods for wavelets |
| 65C60 | Computational problems in statistics (MSC2010) |
| 62G20 | Asymptotic properties of nonparametric inference |
| 62H35 | Image analysis in multivariate analysis |

\textit{I. M. Johnstone} and \textit{B. W. Silverman}, Ann. Stat. 33, No. 4, 1700--1752 (2005; Zbl 1078.62005)
