CS 361A
(Advanced Algorithms for Internet Applications)



News Flash

Midterm Exam: We have prepared a take-home midterm exam, which is due in class on Wednesday, Nov 13. The exam is available here in PostScript and PDF formats.


        Instructors:   Rajeev Motwani and Nina Mishra
        Teaching Assistant: Mayur Datar

        Class Schedule: Mon/Wed, 3:15-4:30, Gates B12
        Office Hours:
                [Mayur Datar]       Thu/Fri, 1:30-2:30, Gates 482
                [Nina Mishra]       Thu, 3:00-4:00, Gates 484
                [Rajeev Motwani]    Tue, 1:30-2:30, Gates 474

Class Mailing List: We have set up a class mailing list. Please subscribe to it to get the latest information about the class. The list address is cs361a-class@lists.stanford.edu. To subscribe, send an email to majordomo@lists.stanford.edu with the following text in the body of the message: subscribe cs361a-class

Grading Policy Revision: Several MS students want to take this course for a letter grade to satisfy their specialization requirements. We have modified our sign-up policy to allow this, so feel free to sign up for a letter grade. We will use the scribe notes (see below), a couple of homework assignments, and class participation to determine the grade for these students. So, if you sign up for a letter grade, be sure to serve as a scribe for at least one lecture.



To sign up for this course, please send email to Mayur Datar with the following information:
                name, department, status (PhD/MS/UG, year), area of specialization (Databases, Systems, Theory, etc.), and email address.


Course Overview

With the maturing of the Internet, the field of algorithms is undergoing an interesting transformation. For one thing, new areas and applications requiring an algorithmic mind-set have emerged, such as information retrieval and web searching, massive and streaming data, data mining, machine learning, distributed systems (including so-called P2P networks), and network algorithms. To serve these, novel algorithmic techniques have been and are being developed. Furthermore, new applications have led to new models for algorithms, the most prominent of which is the field of algorithms for data streams. This course will give an overview of such topics with an eye towards identifying interesting research directions. Since Stanford people have played a prominent role in these new developments, wherever possible we will attempt to bring in as guest lecturers the original authors of the papers being covered. This course should be of interest to graduate students in computer science and related fields, especially those with a mathematical bent of mind. We will assume familiarity with basic material in algorithms, databases, probability, etc. (at the level of the core undergraduate courses on these topics).

Grading: Since this course will be treated as a graduate research seminar, we expect students will register pass/fail (and not for a letter grade). There will be little by way of formal exams, although we may have occasional homework assignments. In fact, most of the grade will depend on class participation and the scribe notes prepared by students (see below).

Scribe: Each registered student will sign up as the official scribe for a specific lecture. This involves taking detailed notes, reading the background papers, and preparing a set of lecture notes that will be distributed to the entire class.







Week | Dates | Topic | Lecturer | Slides | Scribe Notes
1 | Wed, Sep 25 | Introduction: Computing Distinct Values | Rajeev Motwani | Slides 1 (ppt) | Scribe 1 (ps, pdf)
2 | Mon, Sep 30; Wed, Oct 2 | Data Streams 1 (Sampling/Sketching/Synopses) | Rajeev Motwani; Mayur Datar | Slides 2 (ppt); Slides 3 (ppt) | Scribe 2 (ps, pdf)
3 | Mon, Oct 7; Wed, Oct 9 | Data Streams 2 (Histograms/Quantiles) | Rajeev Motwani; Gurmeet Manku | Slides 4 (ppt); Slides 5 (ppt) | Scribe 4.1 (doc, pdf); Scribe 4.2 (ps, pdf)
4 | Mon, Oct 14; Wed, Oct 16 | Association Rules | Rajeev Motwani; Nina Mishra | Slides 6 (ppt); Slides 7 (ppt) | Scribe 6.1 (doc, pdf); Scribe 6.2 (ps, pdf); Scribe 7 (ps, pdf)
5 | Mon, Oct 21; Wed, Oct 23 | Clustering | Nina Mishra | Slides 8 (ppt); Slides 9 (ppt) | Scribe 8 (doc, pdf); Scribe 9.1 (ps, pdf); Scribe 9.2 (doc, pdf)
6 | Mon, Oct 28; Wed, Oct 30 | Machine Learning | Nina Mishra | Slides 10 (ppt); Slides 11 (ppt) | Scribe 10 (ps, pdf); Scribe 11 (ps, pdf)
7 | Mon, Nov 4; Wed, Nov 6 | Nearest Neighbors and Similarity | Aris Gionis; Rajeev Motwani | Slides 12 (ps, pdf); Slides 13 (ppt) | Scribe 12.1 (doc, pdf); Scribe 12.2 (doc, pdf); Scribe 13 (ps, pdf)
8 | Mon, Nov 11; Wed, Nov 13 | External Memory Algorithms | Rajeev Motwani; Kamesh Munagala | Slides 14 (ppt); Slides 15 (ppt) | Scribe 15 (doc, pdf)
9 | Mon, Nov 18; Wed, Nov 20 | Web Graph and Link Analysis | Monika Henzinger; Glen Jeh; Taher Haveliwala | Slides 16 (ppt); Slides 17 (ppt) | Scribe 16 (doc, pdf); Scribe 16.2 (doc, ps); Scribe 17.1 (doc, pdf); Scribe 17.2 (doc, pdf); Scribe 17.3 (doc, pdf)
10 | Mon, Nov 25; Wed, Nov 27 | Network Algorithms | Balaji Prabhakar | Slides 18 (ps); Slides 19 (ps) | Scribe 19.1 (ps, pdf); Scribe 19.2 (doc, pdf)
11 | Mon, Dec 2; Wed, Dec 4 | Distributed Hashing and P2P Networks; Removing Duplicates | Datar/Motwani; Andrei Broder | Slides 20 (ppt) |



Reading List

Introduction: Computing Distinct Values - Rajeev Motwani

Data Streams 1 (Sampling/Sketching/Synopses) - Rajeev Motwani/Mayur Datar





Data Streams 2 (Synopses/Algorithms) - Rajeev Motwani/Gurmeet Manku

    Quantiles and Histograms

    Sliding Window Algorithms

Association Rules - Rajeev Motwani/Nina Mishra

    Association Rule Mining and Generalizations

    Combinatorics of Association Rules

    Frequency Counting


Clustering - Nina Mishra

    Basic Clustering Algorithms

    Clustering Large Data Sets and Streams

    Database Clustering

    Spectral Clustering

Machine Learning - Nina Mishra

Similarity and Nearest Neighbors - Aris Gionis/Rajeev Motwani


    Nearest Neighbors

    Random Projections

External Memory Algorithms - Kamesh Munagala

Web Graph and Link Analysis - Monika Henzinger/Glen Jeh/Taher Haveliwala

    PageRank and Hubs-Authorities

    Personalized PageRank

Network Algorithms - Balaji Prabhakar

Distributed Hashing and P2P Networks - Mayur Datar


Additional Topics


Epidemics, Gossiping, and Rumor Mongering


OLAP and Datacubes


Fuzzy Information and Aggregation Algorithms



Indexing and Searching:

Linear Algebra in Information Retrieval

Error-correcting Codes