zbMATH — the first resource for mathematics

Elements of statistical disclosure control. (English) Zbl 0973.62009
Lecture Notes in Statistics. 155. New York, NY: Springer. xv, 261 p. (2001).
This book is devoted to a problem that runs opposite to most sample survey problems: how to publish information on statistical data without disclosing confidential information. The discipline that deals with such problems is called statistical disclosure control (SDC). The data to be published are assumed to contain personal information of a private nature (sensitive variables, e.g. “opinion about legislation of drugs”) and information that can be used to re-identify the respondent (identifying variables, e.g. occupation, date of birth, place of residence). The problem is to publish as much information useful to statisticians as possible while minimizing the risk that sensitive variable values can be attributed to an individual respondent. To this end one can transform the identifying variables, for example by global recoding (grouping, e.g. using “age in years” instead of “date of birth”) or local suppression (e.g. replacing the value 16 of the variable “number of children” by “missing”). In multidimensional data, the possible combinations of identifying variables must also be taken into account.
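The two transformations mentioned above can be illustrated with a minimal Python sketch. The records, variable names and the 10-year age bands are invented for illustration and are not taken from the book; the sketch also assumes that a suppressed (missing) value is compatible with any value when records are compared, which is one possible convention.

```python
# Hypothetical microdata: three identifying variables and one sensitive variable.
records = [
    {"age": 34, "occupation": "teacher", "children": 2, "opinion": "for"},
    {"age": 37, "occupation": "teacher", "children": 2, "opinion": "against"},
    {"age": 36, "occupation": "teacher", "children": 16, "opinion": "for"},
    {"age": 52, "occupation": "mayor", "children": 2, "opinion": "against"},
    {"age": 58, "occupation": "mayor", "children": 2, "opinion": "for"},
]
KEY = ("age", "occupation", "children")  # combination of identifying variables


def compatible(a, b, key=KEY):
    """Two records match on the key if every key value is equal or missing (assumed convention)."""
    return all(a[k] is None or b[k] is None or a[k] == b[k] for k in key)


def unique_records(rows, key=KEY):
    """Indices of records that match no record other than themselves on the key."""
    return [i for i, a in enumerate(rows)
            if sum(compatible(a, b, key) for b in rows) == 1]


def recode_age(rows, width=10):
    """Global recoding: replace exact age by an age band in every record."""
    out = []
    for r in rows:
        low = (r["age"] // width) * width
        out.append({**r, "age": f"{low}-{low + width - 1}"})
    return out


def suppress(rows, index, variable):
    """Local suppression: set one value in one record to missing (None)."""
    return [{**r, variable: None} if i == index else r
            for i, r in enumerate(rows)]


recoded = recode_age(records)
protected = suppress(recoded, 2, "children")

print(unique_records(records))    # every record is unique on the raw key
print(unique_records(recoded))    # only the record with 16 children remains unique
print(unique_records(protected))  # no unique records remain
```

Global recoding changes a variable in every record, whereas local suppression touches a single cell; in this toy data set the recoding removes most unique combinations and one suppression removes the last one.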
The authors consider various measures of the quality of data-protecting transformations, based on the risk of re-identification and on information loss, and describe techniques for selecting an optimal protective transformation for microdata and tabular data. The transformations covered include global recoding, local suppression, noise addition, rounding, post-randomization and data swapping. Dedicated SDC software, namely \(\mu\)-ARGUS and \(\tau\)-ARGUS, and SDC practice at Statistics Netherlands are also discussed.
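As a rough illustration of two of the perturbative techniques listed above, the following Python sketch rounds table cells to a multiple of a base and applies post-randomization to a categorical variable through a transition matrix. The function names, the base 5 and the transition probabilities are assumptions made for this example; they are not the book's notation and not the interface of the ARGUS packages.

```python
import random


def round_to_base(count, base=5):
    """Conventional rounding of a non-negative table cell to the nearest multiple of base."""
    return base * ((count + base // 2) // base)


def post_randomize(values, transition, rng):
    """Post-randomization: replace each category by a random draw from its row
    of the transition matrix, given as {old: {new: probability}}."""
    out = []
    for v in values:
        targets = list(transition[v])
        weights = [transition[v][t] for t in targets]
        out.append(rng.choices(targets, weights=weights, k=1)[0])
    return out


# Hypothetical frequency table cells, rounded to base 5.
cells = [0, 2, 3, 7, 8, 12, 13]
print([round_to_base(c) for c in cells])

# Hypothetical transition matrix: each value is kept with probability 0.9.
transition = {
    "urban": {"urban": 0.9, "rural": 0.1},
    "rural": {"urban": 0.1, "rural": 0.9},
}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
print(post_randomize(["urban", "rural", "urban", "urban"], transition, rng))
```

Because the transition probabilities are known to the data publisher, they can be released alongside the data so that analysts can correct aggregate estimates, while an individual perturbed record no longer certainly reflects the respondent's true category.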

62D05 Sampling theory, sample surveys
62-02 Research exposition (monographs, survey articles) pertaining to statistics
68P25 Data encryption (aspects in computer science)