
Bounds on the sample complexity for private learning and private data release. (English) Zbl 1311.68131

The paper is timely, given the current need to protect privacy when mining data. The authors had previously introduced the concept of private learning by combining probably approximately correct (PAC) learning with differential privacy, that is, learning while preserving the confidentiality of sensitive information.
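For orientation, recall the two standard ingredients (paraphrased here, not quoted from the paper): a randomized algorithm \(A\) is \(\varepsilon\)-differentially private if for every two databases \(D,D'\) differing in a single entry and for every set \(S\) of possible outputs
\[
\Pr[A(D)\in S]\le e^{\varepsilon}\cdot\Pr[A(D')\in S],
\]
and an algorithm is a private PAC learner for a concept class \(C\) if it is \(\varepsilon\)-differentially private with respect to its input sample and, for every target concept \(c\in C\) and every distribution on examples, outputs with probability at least \(1-\beta\) a hypothesis with error at most \(\alpha\).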
The present study goes a step further by giving tight bounds on the sample complexity of private learning, contrasted with that of non-private learning. The sample complexities of proper and improper private learners are shown to be strongly separated, as are those of efficient and inefficient proper private learning.
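To convey the scale of the gap (a rough recollection of the known baseline, not a claim taken from the paper under review): every finite class \(C\) admits a generic private learner with sample complexity on the order of \(O(\log|C|)\), ignoring the dependence on the accuracy, confidence and privacy parameters \(\alpha,\beta,\varepsilon\), whereas non-private PAC learning of \(C\) requires only \(\Theta(\mathrm{VC}(C))\) samples, and \(\log|C|\) may greatly exceed the VC dimension; the bounds in the paper quantify when and for which kinds of learners this gap can be narrowed.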
Finally, the article investigates the relationship, in terms of such bounds, between private learning and data sanitization (the release of data that protects privacy while preserving useful information).
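In the sanitization setting (standard non-interactive formulation, paraphrased here), a sanitizer is a differentially private algorithm that, given a database \(D\), releases a summary or synthetic database \(\hat D\) that is \(\alpha\)-useful for a class \(C\), meaning
\[
\left|\frac{|\{x\in D : c(x)=1\}|}{|D|}-\frac{|\{x\in \hat D : c(x)=1\}|}{|\hat D|}\right|\le\alpha
\quad\text{for every }c\in C;
\]
the paper relates the minimal database size for which such sanitizers exist to the sample complexity of private learning.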

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Software:

SuLQ
