CS 361A - Autumn 2003-04
(Advanced Data Structures and Algorithms)



News Flash

Homework 4 has just been released.


Instructor: Rajeev Motwani
Teaching Assistant: Krishnaram Kenthapadi
Class Schedule: Mon/Wed, 3:15-4:30, Room 380-380W
Office Hours:
        Krishnaram Kenthapadi: Tue 12:45-2:30 and Thu 1:15-2:30, Gates 510 (5-3903)
        Rajeev Motwani: Tue 1:30-2:30, Gates 474 (3-6045)

Course URL:  http://theory.stanford.edu/~rajeev/cs361-2003.html


Class Sign-up: To sign up for this course, please send email to Krishnaram Kenthapadi with the following information: name, department, status (PhD/MS/UG, year), area (Databases, Systems, Theory, etc.), and email address. Please also include your registration status (credit, pass/fail, or audit).

Class Mailing List: We have set up a class mailing list to help you get the latest information regarding the class. The email lists are auto-populated using current course enrolment information. The main list will be: cs361a-aut0304-all@lists.stanford.edu. Those who audit the course can subscribe by sending an email to majordomo@lists.stanford.edu with the following text in the body of the mail: subscribe cs361a-aut0304-guests

Grading: Since this course will be treated as a graduate research seminar, we expect that most students will register pass/fail (and not for a letter grade). If you do choose to sign up for a letter grade, be sure to mention this in your sign-up email to the TA. We will give out 3-4 homeworks, one of which will serve as a take-home midterm exam. There will be no final exam. The scores on these homeworks, together with class participation, will determine your final grade.


Course Overview

Efficient strategies for complex data-structuring problems are essential in the design of fast algorithms for a variety of applications, including combinatorial optimization, databases and data mining, information retrieval and web search, and geometric applications. We will give a systematic exposition of the central ideas in the design of such data structures. The second main theme of this course will be the design and analysis of online algorithms and data stream algorithms. The field of competitive analysis of online algorithms got its start in the amortized analysis for data structures and forms a natural extension of some of the ideas we will discuss in the earlier part of the course. We will present some of the main ideas and motivating applications for this class of algorithms. Time permitting, we will also cover some topics in the related area of algorithms and data structures in the stream model of computation. The material to be covered will be drawn from the following list:

Advanced Data Structures:  hash tables (universal hashing, perfect hashing, locality-sensitive hashing, Bloom filters); data structures for combinatorial optimization (union-find, Fibonacci heaps, dynamic trees, dynamic graph structures); self-adjusting data structures (lists, splay trees); search trees (red-black trees, self-adjusting trees,  treaps, skip lists, finger search trees, biased search trees); fault-tolerant and persistent data structures; suffix trees and string searching; databases/data-mining/data-stream  (histograms, indexes, hashing, synopses and sketches, sliding windows); geometric and kinetic data structures.

Online and Data Stream Algorithms: paging/caching problems; abstractions (k-server problem, request-answer games, and metrical task systems); scheduling and load balancing; network algorithms; data migration/replication in distributed computing; stream algorithms and data structures for database problems.
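
As a small illustration of how the two themes meet, the following is a minimal sketch of the move-to-front heuristic for the list update problem, a self-adjusting list that is also a standard example in competitive analysis. The function names and the cost model (accessing the item at position i costs i, counting from 1) are assumptions made for this sketch; it is not taken from the course notes.

```python
def move_to_front_cost(initial, requests):
    """Serve requests with the move-to-front rule.

    Returns (total access cost, final list order).
    Assumed cost model: accessing the item at 1-based position i costs i.
    """
    items = list(initial)
    total = 0
    for x in requests:
        pos = items.index(x)             # 0-based position of the requested item
        total += pos + 1                 # pay the 1-based position
        items.insert(0, items.pop(pos))  # move the accessed item to the front
    return total, items


def static_list_cost(initial, requests):
    """Cost of serving the same requests on a list that is never reordered."""
    items = list(initial)
    return sum(items.index(x) + 1 for x in requests)


if __name__ == "__main__":
    start = ["a", "b", "c"]
    reqs = ["c", "c", "c", "b"]
    print(move_to_front_cost(start, reqs))  # cost and final order under move-to-front
    print(static_list_cost(start, reqs))    # cost when the list stays fixed
```

On a request sequence with repeated accesses to the same item, the self-adjusting list pays less than the fixed list; comparing an online rule like this against the best offline strategy is the kind of question competitive analysis asks.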

This course should be of interest to graduate students in computer science and related fields, especially those with a mathematical bent of mind. We will assume familiarity with basic material in algorithms, combinatorics, and probability theory (at the level of the core undergraduate courses on these topics).


Handouts and Homeworks


Handout Date Topic Download
1 Wed, Sep 24 Reading List ps or pdf
2 Wed, Sep 24 Notes for Lectures 1/2/3 ps or pdf
3 Wed, Oct 1 Notes for Lectures 3/4 hard-copy only
4 Mon, Oct 6 Notes for Lecture 5 hard-copy only
5 Wed, Oct 8 Homework 1 ps or pdf
6 Mon, Oct 13 Notes for Lecture 6 hard-copy only
7 Wed, Oct 15 Notes for Lectures 7/8 hard-copy only
8 Mon, Oct 20 Homework 2 ps or pdf
9 Wed, Oct 22 Notes for Lectures 9/10 hard-copy only
10 Wed, Oct 29 Notes for Lecture 11 hard-copy only
11 Mon, Nov 3 Homework 3 ps or pdf
12 Mon, Nov 3 Notes for Lecture 12 hard-copy only
13 Wed, Nov 5 Notes for Lecture 13 hard-copy only
14 Mon, Nov 10 Notes for Lecture 14 hard-copy only
15 Wed, Nov 12 Slides for Lecture 15 ppt
16 Mon, Nov 17 Slides for Lectures 16/17 ppt
17 Wed, Nov 19 Homework 4 ps or pdf
18 Mon, Nov 24 Slides for Lecture 18 ppt
19 Wed, Nov 26 Slides for Lecture 19 ppt
20 Mon, Dec 1 Slides for Lecture 20 ppt
21 Wed, Dec 3 Slides for Lecture 21 ppt



Lecture Schedule

Lecture Date Topic Lecture Notes
1/2 Wed, Sep 24 and Mon, Sep 29 Should Tables be Sorted? Handout 2 (ps, pdf)
3/4 Wed, Oct 1 and Mon, Oct 6 Hashing: Universal and Perfect Handout 2 (ps, pdf), Handout 3 (hard-copy only)
5 Wed, Oct 8 Amortization and List Update Handout 4 (hard-copy only)
6 Mon, Oct 13 Disjoint Sets and Union-Find Handout 6 (hard-copy only)
7/8 Wed, Oct 15 and Mon, Oct 20 Competitive Analysis and Paging Handout 7 (hard-copy only)
9/10 Wed, Oct 22 and Mon, Oct 27 Randomized Online Algorithms Handout 9 (hard-copy only)
11 Wed, Oct 29 Self-Adjusting Search Trees Handout 10 (hard-copy only)
12 Mon, Nov 3 Treaps: Randomized Search Trees Handout 12 (hard-copy only)
13 Wed, Nov 5 Skip Lists Handout 13 (hard-copy only)
14 Mon, Nov 10 Self-Adjusting and Fibonacci Heaps Handout 14 (hard-copy only)
15 Wed, Nov 12 Hashing for Massive/Streaming Data Handout 15 (ppt)
16/17 Mon, Nov 17 and Wed, Nov 19 Synopses, Samples, and Sketches Handout 16 (ppt)
18 Mon, Nov 24 Fingerprints, Min-Hashing, and Document Similarity Handout 18 (ppt)
19 Wed, Nov 26 Data Mining: Association Rules Handout 19 (ppt)
20 Mon, Dec 1 Bloom Filters Handout 20 (ppt)
21 Wed, Dec 3 Near Neighbors Handout 21 (ppt)



Reading List

Text-books: There is no required text-book for this course, but the following may be useful for some of the topics we plan to cover. At least the first three are recommended (but not required); the rest are useful reading if you are interested in delving further into the topics.

Lectures 1 and 2 - Should tables be sorted?

Lectures 3 and 4 - Hashing: Universal and Perfect

Lecture 5 - Amortization and List Update Problem


Lecture 6 - Disjoint Sets and Union-Find


Lectures 7 and 8 - Competitive Analysis and Paging


Lectures 9 and 10 - Randomized Online Algorithms


Lecture 11 - Self-Adjusting Search Trees


Lecture 12 - Treaps: Randomized Search Trees

Lecture 13 - Skip Lists


Lecture 14 - Self-Adjusting and Fibonacci Heaps


Lecture 15 - Hashing for Massive/Streaming Data


Lectures 16 and 17 - Synopses, Samples, and Sketches

Lecture 18 - Fingerprints, Min-Hashing, and Document Similarity


Lecture 19 - Data Mining: Association Rules


Lecture 20 - Bloom Filters


Lecture 21 - Near Neighbors

  Nearest Neighbors

    Random Projections