CS 361A - Autumn Quarter 2005-06
(Advanced Data Structures and Algorithms)
News Flash
Please note the change in office hours for Dilys Thomas.
Administrivia
Instructor: Rajeev Motwani
Teaching Assistant: Dilys Thomas (dilys@stanford.edu)
Class Schedule: Mon/Wed, 3:15-4:30, Gates B08
Office Hours:
Dilys Thomas: Tue/Thu 2-4pm (Location: Gates 482, Phone: 723-4532)
Rajeev Motwani: Mon/Wed, 4:30pm (right after the lectures)
Course URL: http://theory.stanford.edu/~rajeev/cs361.html
Class Sign-up: To sign up for this course, please send email to Dilys Thomas with the following information: name, department, status (PhD/MS/UG, year), area (Databases, Systems, Theory, etc.), and email address. Please also include your registration status (credit, pass/fail, audit).
Mailing Lists and Newsgroup: We have set up a class mailing list to help you get the latest information regarding the class. The email lists are auto-populated using current course enrolment information. The main list will be cs361a-aut0506-all@lists.stanford.edu. Those who audit the course can subscribe by sending an email to majordomo@lists.stanford.edu with the following text in the body of the mail: subscribe cs361a-aut0506-guests. Those who are auditing the course and filled in the signup sheet in class on September 28 have already been added to the guest list. We also have a newsgroup, su.class.cs361a, for the class.
Grading: Since this course will be treated as a graduate research seminar, we expect that most students will register pass/fail (and not for a letter grade). If you do choose to sign up for a letter grade, be sure to mention this in your sign-up email to the TA. We will give out 3-4 homeworks, one of which will serve as a take-home midterm exam. There will be no final exam. The scores on these homeworks as well as class participation will determine your final grade.
Efficient strategies for complex data-structuring problems are essential in the design of fast algorithms for a variety of applications, including combinatorial optimization, databases and data mining, information retrieval and web search, and geometric applications. We will give a systematic exposition of the central ideas in the design of such data structures. The second main theme of this course will be the design and analysis of online algorithms and data stream algorithms. The field of competitive analysis of online algorithms got its start in the amortized analysis for data structures and forms a natural extension of some of the ideas we will discuss in the earlier part of the course. We will present some of the main ideas and motivating applications for this class of algorithms. Time permitting, we will also cover some topics in the related area of algorithms and data structures in the stream model of computation. The material to be covered will be drawn from the following list:
Advanced Data Structures: hash tables (universal hashing, perfect hashing, locality-sensitive hashing, Bloom filters); data structures for combinatorial optimization (union-find, Fibonacci heaps, dynamic trees, dynamic graph structures); self-adjusting data structures (lists, splay trees); search trees (red-black trees, self-adjusting trees, treaps, skip lists, finger search trees, biased search trees); fault-tolerant and persistent data structures; suffix trees and string searching; databases/data-mining/data-stream (histograms, indexes, hashing, synopses and sketches, sliding windows); geometric and kinetic data structures.
Online and Data Stream Algorithms: paging/caching problems; abstractions (k-server problem, request-answer games, and metrical task systems); scheduling and load balancing; network algorithms; data migration/replication in distributed computing; stream algorithms and data structures for database problems.
This course should be of interest to graduate students in computer science and related fields, especially those with a mathematical bent of mind. We will assume familiarity with basic material in algorithms, combinatorics, and probability theory (at the level of the core undergraduate courses on these topics).
Handout | Date | Topic | Download |
1 | Mon, Sep 26 | Reading List | |
2 | Mon, Sep 26 | Notes for Lectures 1 & 2 | |
3 | Mon, Oct 3 | Notes for Lectures 3 & 4 | hard-copy only |
4 | Mon, Oct 10 | Notes for Lecture 5 | hard-copy only |
5 | Mon, Oct 10 | Notes for Lecture 6 | hard-copy only |
6 | Mon, Oct 10 | Homework 1 | |
7 | Wed, Oct 12 | Notes for Lectures 7 & 8 | hard-copy only |
9 | Wed, Oct 19 | Notes for Lectures 9 & 10 | hard-copy only |
10 | Mon, Oct 24 | Solutions for Homework 1 | hard-copy only |
11 | Mon, Oct 24 | Homework 2 | |
12 | Wed, Oct 26 | Notes for Lecture 11 | hard-copy only |
13 | Wed, Oct 26 | Notes for Lecture 12 | hard-copy only |
14 | Wed, Nov 2 | Notes for Lecture 13 | hard-copy only |
15 | Wed, Nov 2 | Notes for Lecture 14 | hard-copy only |
16 | Wed, Nov 9 | Additional slides for Lecture 14 | |
17 | Wed, Nov 9 | Notes for Lecture 15 | |
18 | Wed, Nov 9 | Notes for Lecture 16/17 | |
19 | Mon, Nov 14 | Solutions for Homework 2 | hard-copy only |
20 | Mon, Nov 14 | Homework 3 | |
21 | Mon, Nov 28 | Notes for Lecture 18 | |
22 | Wed, Nov 30 | Notes for Lecture 19 | |
23 | Wed, Nov 30 | Notes for Lecture 20 | |
24 | Wed, Nov 30 | Homework 4 | |
Lecture | Date | Topic | Lecture Notes |
1 | Mon, Sep 26 | | |
3 | Mon, Oct 3 | | |
5 | Mon, Oct 10 | | Handout 4 (hard-copy only) |
6 | Wed, Oct 12 | | Handout 5 (hard-copy only) |
7, 8 | Mon, Oct 17; Wed, Oct 19 | | Handout 7 (hard-copy only) |
9, 10 | Mon, Oct 24; Wed, Oct 26 | | Handout 9 (hard-copy only) |
11 | Mon, Oct 31 | | Handout 12 (hard-copy only) |
12 | Wed, Nov 2 | | Handout 13 (hard-copy only) |
13 | Mon, Nov 7 | | Handout 14 (hard-copy only) |
14 | Wed, Nov 9 | | Handout 16 (ppt), Handout 15 (hard-copy only) |
15 | Mon, Nov 14 | | Handout 17 (ppt) |
16 | Wed, Nov 16; Mon, Nov 28 | | Handout 18 (ppt) |
18 | Wed, Nov 30 | | Handout 21 (ppt) |
19 | Mon, Dec 5 | | Handout 22 (ppt) |
20 | Wed, Dec 7 | Data Mining: Association Rules | Handout 23 (ppt) |
Text-books: There is no required text-book for this course, but the following may be useful for some of the topics we plan to cover. At least the first three are recommended (but not required); the rest are useful reading if you are interested in delving further into the topics.
Randomized Algorithms, R. Motwani and P. Raghavan.
Data Structures and Network Algorithms, R.E. Tarjan.
Online Computation and Competitive Analysis, A. Borodin and R. El-Yaniv.
Introduction to Algorithms, T.H. Cormen, C.E. Leiserson, R.L. Rivest, and C. Stein, McGraw-Hill, 2002.
Managing Gigabytes: Compressing and Indexing Documents and Images, I.H. Witten, A. Moffat, and T.C. Bell, Morgan Kaufmann, 1999.
Modern Information Retrieval, Baeza-Yates and Ribeiro-Neto, Addison-Wesley, 1999.
Lectures 1 and 2 - Should tables be sorted?
Lectures 3 and 4 - Hashing: Universal and Perfect
Lecture 5 - Amortization and List Update Problem
Lecture 6 - Disjoint Sets and Union-Find
Lectures 7 and 8 - Competitive Analysis and Paging
Lectures 9 and 10 - Randomized Online Algorithms
Lecture 11 - Self-Adjusting Search Trees
Lecture 12 - Treaps: Randomized Search Trees
Lecture 14 (part 1) - Caching Queues
Lecture 14 (part 2) - Self-Adjusting and Fibonacci Heaps
Lecture 15 - Hashing for Massive/Streaming Data
Lectures 16 and 17 - Synopses, Samples, and Sketches
Lecture 18 - Fingerprints, Min-Hashing, and Document Similarity
Nearest Neighbors
Random Projections
Lecture 20 - Data Mining: Association Rules