kACTUS 2: Privacy preserving in classification tasks using k-Anonymity

Slava Kisilevich, Yuval Elovici, Bracha Shapira, Lior Rokach

Protecting Persons While Protecting the People: Second Annual Workshop on …, 2009

k-anonymity is the method used for masking sensitive data which successfully solves the problem of re-linking of data with an external source and makes it difficult to re-identify the individual. Thus k-anonymity works on a set of quasi-identifiers (public sensitive attributes), whose possible availability and linking is anticipated from external dataset, and demands that the released dataset will contain at least k records for every possible quasi-identifier value. Another aspect of k is its capability of maintaining the truthfulness of the released data (unlike other existing methods). This is achieved by generalization, a primary technique in k-anonymity. Generalization consists of generalizing attribute values and substituting them with semantically consistent but less precise values. When the substituted value doesn’t preserve semantic validity the technique is called suppression which is a private case of …