A Study of Perturbation Techniques for Data Privacy
Cynthia Dwork and Nina
Class Schedule: Tuesdays, 1-3:00pm, Gates 259
Office Hours: email to schedule
Number of Units: 2
The digital age has enabled widespread access to and collection of data. While there are several advantages to ubiquitous access to data, there is also the potential for breaching the privacy of individuals.
A statistical database typically stores personal information about n individuals, where n is usually very large. A statistical database system lets users obtain aggregate statistical information (such as medians, averages, and counts) while preserving the privacy of individuals. Typical applications include medical, financial, and census data.
The course will study techniques for simultaneously enabling access to aggregate data and preserving privacy. Data perturbation is a classical technique for solving this problem. There are two flavors of data perturbation. In the first version, the data are perturbed once, and the perturbed values are published. In the second version, the data are held secret; the database algorithm computes the true response to each query and adds noise to it, reporting only the noisy answer. Both versions of the problem have a rich literature.
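To make the two flavors concrete, here is a minimal sketch in Python. It is an illustration only: the function and class names, the made-up salary data, and the choice of zero-mean Gaussian noise are all assumptions, not something the course description specifies, and the noise level is not calibrated to any particular privacy definition.

```python
import random
import statistics


def perturb_once(values, sigma, rng):
    """First flavor: add noise to every record once; the result is published."""
    # Gaussian noise is an assumption made for this sketch.
    return [v + rng.gauss(0.0, sigma) for v in values]


class NoisyQueryDatabase:
    """Second flavor: the data stay secret; each answer is perturbed."""

    def __init__(self, values, sigma, rng):
        self._values = list(values)  # never released directly
        self._sigma = sigma
        self._rng = rng

    def mean(self):
        # Compute the true response, then report only the noisy answer.
        true_answer = statistics.fmean(self._values)
        return true_answer + self._rng.gauss(0.0, self._sigma)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
salaries = [52_000, 61_000, 58_500, 47_250, 75_000]  # hypothetical data

# Flavor 1: publish a perturbed copy of the whole table.
published = perturb_once(salaries, sigma=1_000.0, rng=rng)

# Flavor 2: keep the table private and answer queries with noise.
db = NoisyQueryDatabase(salaries, sigma=1_000.0, rng=rng)
noisy_mean = db.mean()
```

In the first flavor anyone can compute any statistic from `published`, so the noise is spent once, up front. In the second, fresh noise is drawn for every query. How much noise is needed, and what it actually protects, is the kind of question the course sets out to examine.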
The goals of the course are to:
- understand the different definitions of data privacy
- understand the techniques for achieving privacy
- assess to what extent the suggested measures actually provide data privacy
- suggest new definitions of privacy
- suggest new algorithms for providing privacy
Background in probability, statistics, cryptography, and algorithms would be helpful.
The course may be taken only pass/fail, not for a letter grade. There will be no exams. The grade will be based on a presentation, given on the last day of class (June 1st), of a paper selected from the list below. Please tell the instructors which paper(s) you plan to present by May 27, 2004. If you'd like to discuss a paper that's not on the list, please email the instructors.
- Introduction to Privacy
- Introduction to Cryptography, Secure Function Evaluation
- Introduction to Cryptography
- Query Auditing
- Sampling Contingency Tables
- Cell Suppression (Nina Mishra)
- Limits of Perturbation, Output Perturbation
- Output Perturbation (Kobbi Nissim)
- Students discuss their writeups
Secure Function Evaluation
Privacy Preserving Data Mining on Vertically Partitioned Databases. C. Dwork and K. Nissim. Manuscript. 2004.