SafePub: A Truthful Data Anonymization Algorithm With Strong Privacy Guarantees
Keywords
Data privacy
differential privacy
anonymization
disclosure control
classification
Ethics
BJ1-1725
Electronic computers. Computer science
QA75.5-76.95
Abstract
Methods for privacy-preserving data publishing and analysis trade off privacy risks for individuals against the quality of output data. In this article, we present a data publishing algorithm that satisfies the differential privacy model. The transformations performed are truthful, which means that the algorithm does not perturb input data or generate synthetic output data. Instead, records are randomly drawn from the input dataset and the uniqueness of their features is reduced. This also offers an intuitive notion of privacy protection. Moreover, the approach is generic, as it can be parameterized with different objective functions to optimize its output towards different applications. We show this by integrating six well-known data quality models. We present an extensive analytical and experimental evaluation and a comparison with prior work. The results show that our algorithm is the first practical implementation of the described approach and that it can be used with reasonable privacy parameters resulting in high degrees of protection. Moreover, when parameterizing the generic method with an objective function quantifying the suitability of data for building statistical classifiers, we measured prediction accuracies that compare very well with results obtained using state-of-the-art differentially private classification algorithms.
Date
2018-01-01
Type
Article
Identifier
oai:doaj.org/article:a20b325853374e22ae9ba8990017c879
2299-0984
10.1515/popets-2018-0004
https://doaj.org/article/a20b325853374e22ae9ba8990017c879