
The effect of microaggregation by individual ranking on the estimation of moments. (English) Zbl 1431.62666

Summary: Microaggregation by individual ranking (IR) is an important technique for masking confidential econometric data. While being a successful method for controlling the disclosure risk of observations, IR also affects the results of statistical analyses. We conduct a theoretical analysis on the estimation of arbitrary moments from a data set that has been anonymized by means of the IR method. We show that classical moment estimators remain both consistent and asymptotically normal under weak assumptions. This theory provides the justification for applying standard statistical estimation techniques to the anonymized data without having to correct for a possible bias caused by anonymization.
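The summary does not spell out the IR procedure, so the following is only a minimal sketch under the usual description of the method: each variable is sorted on its own, the sorted values are split into consecutive groups of a fixed size, and every value is replaced by its group mean. The function names, the group size of 3, the handling of leftover values, and the simulated lognormal data are illustrative assumptions, not taken from the paper. The sketch compares empirical raw moments before and after masking.

```python
import random


def microaggregate_ir(values, group_size=3):
    """Mask a single variable by individual ranking (illustrative sketch).

    The values are ranked, split into consecutive groups of `group_size`,
    and each value is replaced by the mean of its group. Leftover values
    that do not fill a whole group are merged into the last group
    (an assumption made here for simplicity).
    """
    n = len(values)
    if group_size < 1 or n < group_size:
        raise ValueError("need at least one full group")
    # Indices of the observations, ordered by their value.
    order = sorted(range(n), key=lambda i: values[i])
    masked = [0.0] * n
    n_groups = n // group_size
    for g in range(n_groups):
        start = g * group_size
        # The last group absorbs any remainder.
        end = n if g == n_groups - 1 else start + group_size
        idx = order[start:end]
        group_mean = sum(values[i] for i in idx) / len(idx)
        for i in idx:
            masked[i] = group_mean  # keep the original record order
    return masked


def raw_moment(values, r):
    """Empirical r-th raw moment, i.e. the mean of x**r."""
    return sum(v ** r for v in values) / len(values)


# Simulated skewed data (hypothetical; stands in for a confidential variable).
random.seed(0)
x = [random.lognormvariate(0.0, 0.5) for _ in range(6000)]
x_masked = microaggregate_ir(x, group_size=3)

for r in (1, 2, 3):
    print(f"moment {r}: original={raw_moment(x, r):.5f} "
          f"masked={raw_moment(x_masked, r):.5f}")
```

In this sketch the first moment is preserved up to rounding because group means are averaged back with their group sizes, while higher moments differ slightly; the paper's result concerns the large-sample behaviour of such estimators, which a single simulation can only illustrate, not establish.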

MSC:

62P20 Applications of statistics to economics
62F07 Statistical ranking and selection procedures
