Privacy and Databases
This project focuses on providing privacy guarantees in databases and data mining. This is an early version of the website, centered on the interests of Hector Garcia-Molina, Rajeev Motwani, and their research groups in this area. It is part of a larger NSF ITR project called PORTIA (Privacy, Obligations, and Rights in Technologies of Information Assessment) involving other faculty at Stanford (Dan Boneh and John Mitchell), Yale (Joan Feigenbaum, Ravi Kannan, and Avi Silberschatz), Microsoft Research (Cynthia Dwork), University of New Mexico (Stephanie Forrest), NYU (Helen Nissenbaum), and Stevens Inst. of Technology (Rebecca Wright), as well as folks at government agencies and in industry.
Faculty: Hector Garcia-Molina, Rajeev Motwani
Visitors: Nina Mishra
Students:
Bob Mungamuru
Shubha Nabar
Ying Xu
Alumni:
Gagan Aggarwal
Krishnaram Kenthapadi
Mayank Bawa
Prasanna Ganesan
Dilys Thomas
Dilys Thomas is the keeper of our schedule of meetings. If you wish to give a talk, please contact him.
Books, Surveys, Talks, Tutorials, and Workshops
Summaries from Working Groups, IBM Almaden Institute on Privacy in Data Systems. 2003.
Book: Database Security. S. Castano, M. Fugini, G. Martella, and P. Samarati. Addison-Wesley (1995).
Book: Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies. L. Zayatz, P. Doyle, J. Theeuwes and J. Lane (eds), Urban Institute, Washington, DC, 2001.
Tutorial: Privacy, Security, and Data Mining. C. Clifton. ECML and PKDD 2002. (Briefing in Powerpoint and PDF).
Tutorial: Privacy-Enhanced Data Management for Next-Generation e-Commerce. C. Clifton, I. Fundulaki, and A. Sahuguet. VLDB 2003.
Privacy Preserving Data Mining: Challenges and Opportunities, R. Srikant, Plenary Talk at the Sixth Pacific-Asia Conf. on Knowledge Discovery and Data Mining 2002.
Database Security: Status and Prospects. S. Jajodia. IBM Almaden Institute on Privacy in Data Systems, 2003.
Mathematics and the Privacy Laws (abstract) (ppt) (html) (pdf) M. Shamos. ALADDIN Workshop on Privacy in Data, 2003.
Privacy in the Information Age: A National Academies Study. J. Waldo. IBM Almaden Institute on Privacy in Data Systems, 2003.
'I Didn't Buy it for Myself': Privacy and Ecommerce Personalization, L. Cranor. ACM Workshop on Privacy in the Electronic Society 2003.
Defining Privacy for Data Mining. C. Clifton, M. Kantarcioglu, and J. Vaidya. Chapter in Next Generation Data Mining, AAAI/MIT Press, to appear.
Video: Privacy Preserving Data Mining. C. Clifton. CERIAS Security Seminar.
Workshop on Privacy, Security, and Data Mining held at International Conference on Data Mining, 2002.
DBSec (IFIP Working Conference on Database and Application Security).
Applications and Media Articles
Privacy-Preserving Distributed Queries for a Clinical Case Research Network. G. Schadow, S.J. Grannis, and C.J. McDonald. Workshop on Privacy, Security, and Data Mining 2002.
Northwest gave U.S. data on passengers. MSNBC News (1/18/04).
Northwest Airlines' Disclosure of Passenger Data to NASA. EPIC.
The Complexity Underlying JetBlue's Privacy Policy Violations. A. I. Antón, Q. He, and D. L. Baumer (2003).
U.S. Calls Release of JetBlue Data Improper. New York Times (2/21/04).
Senators Question Rumsfeld on Privacy Act Violations in JetBlue Case. CDT Policy Post (10/23/03).
Army Admits Using JetBlue Data. Wired News (9/23/03).
JetBlue violates privacy policy. CNN.com (9/19/03).
Putting Controls On Fed's Automated Intelligence Gathering. Information Week (4/30/03).
Data privacy in a wired world. Red Herring (1/22/03).
IBM researcher eyes databases with a conscience. NetworkWorldFusion (8/27/02).
IBM unveils Web privacy work. Computerworld (5/31/02).
Privacy and human rights. D. Banisar. Electronic Privacy Information Center (2000).
Freebies and privacy: What net users think. A. Westin. Technical Report, Opinion Research Corporation, 1999.
The End of Privacy. The Economist (4/99).
Data Mining: Staking a Claim on Your Privacy. Office of the Information and Privacy Commissioner, Ontario (1998).
Privacy Matters. Wired News (Special Collection).
Crypto-Gram Newsletter. Bruce Schneier.
Defining Privacy
Towards a theory of variable privacy. P. Vora. 2003. (slides in PDF)
Databases and Data Integration
Database Encryption in Oracle 9i. Oracle Corporation (2001).
Replacing Personally-Identifying Information in Medical Records, the Scrub System. L. Sweeney. In: Cimino, JJ, ed. Proceedings, Journal of the American Medical Informatics Association. Washington, DC: Hanley & Belfus, Inc., 1996.
Cryptography and Relational Database Management Systems. J. He and M. Wang. IDEAS 2001.
Practical Techniques for Searches on Encrypted Data. D.X. Song, D. Wagner, and A. Perrig. IEEE Symposium on Research in Security and Privacy 2000.
Executing SQL over Encrypted Data in the Database-Service-Provider Model. H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra. SIGMOD 2002.
Providing Database as a Service. H. Hacigumus, B. Iyer, and S. Mehrotra. ICDE 2002.
Hippocratic Databases. R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu. VLDB 2002.
Information Sharing across Private Databases. R. Agrawal, A. Evfimievski, and R. Srikant. SIGMOD 2003.
Implementing P3P Using Database Technology. R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu. ICDE 2003.
Server Centric P3P. R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu. W3C Workshop on the Future of P3P (2002).
Cardinality-based Inference Control in Data Cubes. L. Wang, D. Wijesekera, and S. Jajodia. Journal of Computer Security (to appear).
OLAP Means On-line Anti-Privacy. L. Wang, D. Wijesekera, and S. Jajodia. ISE Technical Report (2003).
Data Mining
Privacy-Preserving Data Mining, R. Agrawal and R. Srikant. SIGMOD 2000.
Privacy Preserving Clustering By Data Transformation. S.R.M. Oliveira and O.R. Zaïane.
Privacy Preserving Mining of Association Rules, A. Evfimievski, R. Srikant, R. Agrawal, and J. Gehrke. SIGKDD 2002.
Maintaining Data Privacy in Association Rule Mining. S.J. Rizvi and J.R. Haritsa. VLDB 2002.
Limiting Privacy Breaches in Privacy Preserving Data Mining. A. Evfimievski, J. Gehrke, and R. Srikant. PODS 2003.
Privacy Preserving Data Mining, Y. Lindell and B. Pinkas, Journal of Cryptology 15 (2002).
Tools for Privacy Preserving Distributed Data Mining. C. Clifton, M. Kantarcioglu, J. Vaidya, X. Lin, and M. Zhu. ACM SIGKDD Explorations 4 (2003).
Randomization in Privacy-Preserving Data Mining. A. Evfimievski. ACM SIGKDD Explorations 4 (2003).
Security and Privacy Implications of Data Mining, C. Clifton and D. Marks, DMKD 1996.
Developing Custom Intrusion Detection Filters Using Data Mining. C. Clifton and G. Gengo. MILCOM 2000.
Using Sample Size to Limit Exposure to Data Mining. C. Clifton. Journal of Computer Security 8 (2000).
Using Unknowns to Prevent Discovery of Association Rules. Y. Saygin, V.S. Verykios, and C. Clifton. ACM SIGMOD Record 30 (2001).
Privacy Preserving Association Rule Mining in Vertically Partitioned Data. J. Vaidya and C. Clifton. KDD 2002.
Privacy-Preserving K-Means Clustering over Vertically Partitioned Data. J. Vaidya and C. Clifton. KDD 2003.
Assuring Privacy when Big Brother is Watching. M. Kantarcioglu and C. Clifton. DMKD 2003.
A new architecture for Privacy Preserving Data Mining. M. Kantarcioglu and J. Vaidya. Privacy, Security and Data Mining, vol. 14, ACS Series Conferences in Research and Practice in Information Technology.
Secure Set Intersection Cardinality with Application to Association Rule Mining, C. Clifton and J. Vaidya. Under Review (Notes on use).
Privacy Preserving Data Mining of Association Rules on Horizontally Partitioned Data. M. Kantarcioglu and C. Clifton. IEEE TKDE (to appear).
Statistical Databases
Security and Disclosure for Statistical Information. A. Westlake. IBM Almaden Institute on Privacy in Data Systems, 2003.
Preserving Confidentiality AND Providing Adequate Data for Statistical Modeling: The Role of Partial and Perturbed Data (abstract) (ppt) (html) (pdf) S. Fienberg. ALADDIN Workshop on Privacy in Data, 2003.
Protecting data through 'Perturbation' Techniques: Impact on the knowledge discovery process. R.L. Wilson and P. A. Rosen. Journal of Database Management, 14(2), 14-26, April-June 2003.
Revealing information while preserving privacy (abstract) (ppt) (html) (pdf). K. Nissim. ALADDIN Workshop on Privacy in Data, 2003.
On the Privacy of Statistical Databases. I. Dinur and K. Nissim.
Revealing Information while Preserving Privacy. I. Dinur and K. Nissim, PODS 2003.
Security-Control Methods for Statistical Databases: A Comparative Study. N.R. Adam and J.C. Wortmann. ACM Computing Surveys 21(1989).
A data distortion by probability distribution. C.K. Liew, U.J. Choi, and C.J. Liew. TODS 10 (1985).
The statistical security of a statistical database. J.F. Traub, Y. Yemini, and H. Woźniakowski. TODS 9 (1984).
Statistical databases: Characteristics, problems and some solutions. A. Shoshani. VLDB 82.
Statistical Database Design. F. Chin and G. Ozsoyoglu. TODS 6 (1981).
Auditing for secure statistical databases. F. Chin and G. Ozsoyoglu. ACM 81 Conference (1981).
Suppression methodology and statistical disclosure control. L.H. Cox. JASA 75 (1980).
A Security Model for the Statistical Database Problem. D.E. Denning. TODS 5 (1980).
Secure statistical databases with random sample queries. D.E. Denning. TODS 5 (1980).
A fast procedure for finding a tracker in a statistical database. D.E. Denning and J. Schlörer. TODS 5 (1980).
The tracker: a threat to statistical database security. D.E. Denning, P.J. Denning, and M. Schwartz. TODS 4 (1979).
Private Information Retrieval
Survey Talk: Private Information Retrieval. A. Beimel (2003).
Private Information Retrieval - An overview and current trends. D. Asonov. INDOCRYPT 2001.
Private Information Retrieval. B. Chor, O. Goldreich, E. Kushilevitz, and M. Sudan, FOCS 95.
Computationally private information retrieval. B. Chor and N. Gilboa. STOC 97.
Background in Cryptography and Security
Data Security. D.E. Denning and P.J. Denning. Computing Surveys 11 (1979).
Privacy in P3P (Chapter 1). L. Cranor.
Protocols for secure computation. A. Yao. FOCS 1982.
How to generate and exchange secrets. A. Yao. FOCS 1986.
Studies in Secure Multiparty Computation and Applications. R. Canetti. Ph.D. Thesis, Weizmann Institute of Science, Israel, 1995.
Comparing information without leaking it. R. Fagin, M. Naor, and P. Winkler. CACM 39 (1996).
Privacy and Computation Papers - E. Kushilevitz and C. Cachin.
Secure Multi-Party Computation. O. Goldreich. Working Draft (2001).
Secure Multiparty Computation of Approximations. J. Feigenbaum, Y. Ishai, T. Malkin, K. Nissim, M. Strauss, and R. Wright. ICALP 2001.
A Fair and Efficient Solution to the Socialist Millionaires' Problem. F. Boudot, B. Schoenmakers, and J. Traore. Discrete Applied Mathematics 111 (2001).
Secure computation (A Survey) (abstract) (ppt) (html) (pdf). J. Kilian. ALADDIN Workshop on Privacy in Data, 2003.
Links (Relevant People, Projects, and Courses)
ALADDIN Workshop on Privacy in Data
K. Muralidhar's papers on Data Masking and Privacy
Joan Feigenbaum. CPSC457/557: Sensitive Information in a Wired World.
Private Information Retrieval
Projects
Courses
This web site is currently maintained by Rajeev Motwani. Feel free to send contributions for this page.