
Model based disclosure protection. (English) Zbl 1051.68774
Domingo-Ferrer, Josep (ed.), Inference control in statistical databases. From theory to practice. Berlin: Springer (ISBN 3-540-43614-6). Lect. Notes Comput. Sci. 2316, 83-96 (2002).
Summary: We argue that any microdata protection strategy is based on a formal reference model. The extent of model specification yields “parametric”, “semiparametric”, or “nonparametric” strategies. Following this classification, a parametric probability model, such as a normal regression model or a multivariate distribution for simulation, can be specified. Matrix masking, which covers local suppression, coarsening, microaggregation, noise injection, and perturbation, provides examples of the second and third classes of models. Finally, a nonparametric approach, e.g. the use of bootstrap procedures for generating synthetic microdata, can be adopted. In this paper we discuss the application of a regression-based imputation procedure for business microdata to the Italian sample from the Community Innovation Survey. A set of regressions is used to generate flexible perturbation, since the protection varies according to the identifiability of the enterprise; a spatial aggregation strategy, based on principal components analysis, is also proposed. The inferential usefulness of the released data and the protection achieved by the strategy are evaluated.
For the entire collection see [Zbl 0992.68514].
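
The regression-based perturbation described in the summary can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a single predictor, an ordinary least squares fit, and a hypothetical per-record weight `risk` in [0, 1] standing in for the identifiability of the enterprise. The function names `fit_simple_ols` and `perturb`, and the rule for blending true and simulated residuals, are illustrative assumptions only.

```python
import random


def fit_simple_ols(x, y):
    """Least squares fit of y = intercept + slope * x (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def perturb(x, y, risk, seed=0):
    """Release model-based values with protection that grows with risk.

    risk[i] is a hypothetical identifiability score in [0, 1]:
    0 keeps the true residual (the record is released unchanged),
    1 replaces it entirely with simulated normal noise.
    """
    if len(x) < 3:
        raise ValueError("need at least 3 records to estimate the residual scale")
    rng = random.Random(seed)
    intercept, slope = fit_simple_ols(x, y)
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sigma = (sum(r * r for r in residuals) / (len(y) - 2)) ** 0.5
    released = []
    for fi, ri, wi in zip(fitted, residuals, risk):
        noise = rng.gauss(0.0, sigma)
        # Assumed blending rule: shrink the true residual, add simulated noise.
        released.append(fi + (1.0 - wi) * ri + wi * noise)
    return released


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]
    risk = [0.0, 0.2, 0.5, 0.8, 1.0]  # hypothetical identifiability scores
    print(perturb(x, y, risk))
```

In this sketch a record with `risk` 0 is released as observed, while a record with `risk` 1 is replaced by a draw from the fitted regression model, so the perturbation is heavier for more identifiable units. A real application would need several predictors, a defensible identifiability measure, and an evaluation of both data utility and residual disclosure risk.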

68U99 Computing methodologies and applications
68P15 Database theory