Skip to content (access key 's')
Logo of Technion
Logo of CS Department
Logo of CS4People
Events

The Taub Faculty of Computer Science Events and Talks

Theory Seminar: When and How Can Data be Efficiently Released with Privacy?
event speaker icon
Guy Rothblum (M.I.T.)
event date icon
Sunday, 04.01.2009, 12:00
event location icon
Taub 6
We consider private data analysis in the setting in which a trusted and trustworthy curator, having obtained a large data set containing private information, releases to the public a "sanitization'' of the data set that simultaneously protects the privacy of the individual contributors of data and offers utility to the data analyst. We focus on the case where the process is non-interactive; once the sanitization has been released the original data and the curator play no further role.

Blum et al. [STOC '08] showed a remarkable result: for any collection of counting predicate queries, the exponential mechanism of McSherry and Talwar [FOCS '07] can be used to (inefficiently) generate a synthetic data set that maintains approximately correct fractional counts for all of the queries, while ensuring a strong privacy guarantee. In this work we investigate the computational complexity of such non-interactive privacy mechanisms, mapping the boundary between feasibility and infeasibility. We show: 1. For any polynomial-size query set C and polynomial-size data universe, an efficient algorithm for generating a privacy-preserving synthetic data-set that maintains approximate fractional counts, even when the size of the original dataset is sub-polynomial in |C|. 2. When either the query set or the data universe are super-polynomial, assuming one-way functions exist, there is no efficient method for releasing a privacy-preserving synthetic data-set that maintains approximate fractional counts. This also indicates that it is unlikely that the exponential mecahnism can, in general, be implemented efficiently. 3. Turning to the potentially easier problem of privately generating an arbitrary data structure (not necessarily synthetic data) from which it is possible to approximate counts, there is a tight connection between hardness of sanitization and the existence of traitor tracing schemes.

Joint work with Cynthia Dwork, Moni naor, Omer Reingold and Salil Vadhan.