Parametric Bayesian estimation of differential entropy and relative entropy. (English) Zbl 1229.62030

Summary: Given i.i.d. samples drawn from a distribution of known parametric form, we propose minimizing the expected Bregman divergence to form Bayesian estimates of differential entropy and relative entropy, and derive such estimators for the uniform, Gaussian, Wishart, and inverse Wishart distributions. Additionally, formulas are given for a log-gamma Bregman divergence and for the differential entropy and relative entropy of the Wishart and inverse Wishart distributions. The results, as always with Bayesian estimates, depend on the accuracy of the prior parameters, but example simulations show that performance can be substantially better than that of maximum-likelihood or state-of-the-art nonparametric estimators.
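The idea can be illustrated in the simplest case the summary mentions, the Gaussian. Under squared-error loss (a Bregman divergence, D_φ(x, y) = φ(x) − φ(y) − φ′(y)(x − y) with φ(x) = x²), the Bregman-optimal point estimate is the posterior mean of the entropy. The sketch below is not the paper's derivation; it is a minimal illustrative comparison, assuming a hypothetical normal-inverse-gamma conjugate prior with made-up hyperparameters, of the maximum-likelihood plug-in entropy estimate against a posterior-mean estimate for a univariate Gaussian:

```python
import math
import random

random.seed(0)

# True distribution: N(0, sigma^2) with sigma = 2; the differential
# entropy of a Gaussian has the closed form h = 0.5 * ln(2*pi*e*sigma^2).
sigma_true = 2.0
n = 30
x = [random.gauss(0.0, sigma_true) for _ in range(n)]

xbar = sum(x) / n
ss = sum((xi - xbar) ** 2 for xi in x)

# Maximum-likelihood plug-in: substitute the ML variance into the formula.
var_ml = ss / n
h_ml = 0.5 * math.log(2 * math.pi * math.e * var_ml)

# Bayesian posterior-mean estimate (the Bregman-optimal point estimate
# under squared-error loss). With a normal-inverse-gamma conjugate prior,
# the marginal posterior of sigma^2 is inverse-gamma(alpha_n, beta_n),
# and E[ln sigma^2] = ln(beta_n) - digamma(alpha_n).
mu0, kappa0 = 0.0, 1.0          # hypothetical prior hyperparameters
alpha0, beta0 = 2.0, 2.0

alpha_n = alpha0 + n / 2.0
beta_n = (beta0 + 0.5 * ss
          + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * (kappa0 + n)))

def digamma(a, eps=1e-6):
    """Central-difference digamma built on the stdlib log-gamma."""
    return (math.lgamma(a + eps) - math.lgamma(a - eps)) / (2.0 * eps)

e_log_var = math.log(beta_n) - digamma(alpha_n)
h_bayes = 0.5 * (math.log(2.0 * math.pi * math.e) + e_log_var)

h_true = 0.5 * math.log(2.0 * math.pi * math.e * sigma_true ** 2)
print(f"true {h_true:.3f}  ML plug-in {h_ml:.3f}  Bayes {h_bayes:.3f}")
```

The point of the comparison is that the Bayesian estimate averages the entropy over the posterior rather than evaluating it at a single parameter value; with informative priors and small samples this is where the improvement over the plug-in estimate shows up. The paper's closed-form estimators for the uniform, Wishart, and inverse Wishart cases follow the same principle.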

MSC:

62F15 Bayesian inference
62B10 Statistical aspects of information-theoretic topics
62F10 Point estimation
94A17 Measures of information, entropy
