CS265/CME309: Randomized Algorithms and Probabilistic Analysis, Fall 2017



Announcements:

  • Solutions to the final: HERE. Please glance through them, especially if you think we made a grading mistake....
  • Fixed some typos and provided extra clarification to some questions in the final, also changed the parameter 15 to 20 in problem 2, and added the stipulation in 4(c) that it is alright to give an answer that is accurate up to some constant factor. New version uploaded Thursday at 10:45pm. Corresponding Piazza post is here.
  • The take-home final is HERE. It is due at 11:59pm on Wednesday, 12/13. Any corrections/clarifications will be posted both here and on Piazza.
  • The last problem set, number 8 is HERE. It will be due at 10am Friday, 12/8.
  • The take-home final will be posted Thursday 12/7 and will be due at 11:59pm Wednesday 12/13.
  • Fixed one more silly typo in ps7 prob 1. (updated 4pm 11/21).
  • Fixed some minor typos in ps7 (updated 3:30pm 11/20). Sorry for any confusion this might have caused.
  • Problem set 7 is HERE. It will be due at 10am Friday, 12/1.
  • Midterm solutions are posted here. The mean/median/std were 42/43/11 (and the max was 62+2). We didn't count the final part of the last question (worth 1 point) towards the total, and no one received any points on it.
  • For problem 1 in ps6, it is very helpful to use the version of the LLL that allows for a directed dependency graph (i.e. each edge (i,j) in the dependency graph is directed, and we require that each event A_i is mutually independent of the events {A_j : directed edge (i,j) is not in the graph}). The LLL holds in this directed setting, where the "degree" of an event/node is its out-degree, and our proof from class actually works for this stronger version. I have updated ps6 to clarify this. Sorry for the confusion. (updated 6pm, 11/14)
  • Problem set 6 is HERE. It will be due at 10am Friday, 11/17.
  • Practice midterm (from last year) here.
  • The MIDTERM will be in-class, Tuesday Nov 7th, in LATHROP 282. The test is closed-book, though you may bring 1 double-sided 8.5x11 page of notes, that you must have prepared yourself.
  • 10/31: Problem set 5 is now due at 10am on Friday, 11/3.
  • 10/26: Problem set 5 is HERE. The problem set is due before class next Thursday (10am, 11/2). [If you work with a partner, remember to pick a NEW partner every week!]
  • 10/19: Problem set 4 is HERE. The problem set is due before class next Thursday (10am, 10/26).
  • 10/16: I wrote a fairly detailed addendum/context for ps3, problem 4 (the adaptive data science/hyper-parameter tuning problem). Hopefully it will resolve some of the questions that have come up (and please read it before posting questions to Piazza). Addendum/Context HERE.
  • 10/12: Problem set 3 is HERE. The problem set is due before class next Thursday (10am, 10/19). Please submit via Gradescope.
  • 10/6: Videos for this week's lectures are now posted. Thanks Brandon!!! [All lecture videos will be posted to the same location: HERE.]
  • 10/5: Problem set 2 is HERE. The problem set is due before class next Thursday (10am, 10/12). Please submit via Gradescope.
  • 10/1: Brandon has generously offered to video lectures. There may be a lag of 24hrs or so before the hi-res videos are available, but I'll post the links as soon as they are up. Thursday's lecture video is HERE.
  • 9/28: Problem set 1 is HERE. The problem set is due before class next Thursday (10am, 10/5). Please submit via Gradescope, and check the problem set/partner policies outlined below.
  • 9/26: Tentative date for the in-class midterm: Nov 7th.
  • 9/26: I moved my office hours to Tuesdays, 4-5pm (to avoid the conflict with CS300).
  • 9/24: We have a Piazza page for class discussions HERE . This is for your benefit, though you do not need to join it. The TAs and I will do our best to monitor it and will try to answer any questions asked on the forum, though please refer to this course website for all announcements/corrections, etc. Also, please feel free to discuss/answer questions that your classmates post.

  • Instructor: Gregory Valiant (Office hours: Tues 4:00-5:00, Gates 470. Contact info: send email to my last name at stanford dot edu)

    Teaching Assistants:
    Vaggos Chatziafratis (Email first name at stanford.)
    Mitchell McIntire (Email mcint286 at stanford.)
    Hongyang Zhang (Email hongyz at stanford.)

    Office Hours:
    Monday 3:20-4:20, Huang 304 (Hongyang)
    Monday 5:30-6:30, Gates 460 (Hongyang)
    Tuesday 4-5, Gates 470 (Greg)
    Wednesday 9-11am, Gates 460 (Vaggos)
    Wednesday 2:30-4:30, Gates B30 (Mitchell)
    Wednesday 6:30-8:30, Gates 460 (Vaggos/Hongyang)

    Class Time/location: 10:30-11:50 Tue/Thu. Location: STLC 115

    Course description: Randomness pervades the natural processes around us, from the formation of networks, to genetic recombination, to quantum physics. Randomness is also a powerful tool that can be leveraged to create algorithms and data structures which, in many cases, are more efficient and simpler than their deterministic counterparts. This course covers the key tools of probabilistic analysis, and applications of these tools to understand the behaviors of random processes and algorithms. Emphasis is on theoretical foundations, though we will apply this theory broadly, discussing applications in machine learning and data analysis, networking, and systems.

    Topics: Markov and Chebyshev inequalities, Chernoff bounds, random graphs, and expanders, moment generating functions, metric embeddings, the probabilistic method, Lovasz Local Lemma, Markov chains and random walks, MCMC, martingales, stopping times, Azuma-Hoeffding inequality, and many powerful and elegant randomized algorithms whose analyses rely on the above tools.

    Prerequisites: CS 161 and STAT 116, or equivalents and instructor consent.

    Textbook: Mitzenmacher and Upfal, Probability and Computing
    Lecture notes will be provided and posted here for the material that we cover which is not in the book. For those of you with the earlier edition of the book, we will also post notes for material that is in the newer edition but not the original version. For the material we cover that IS in the book, the treatment in class will frequently be in slightly greater depth than that in the book, with greater emphasis on more recent developments and open problems, though we will not provide additional lecture notes for this material.

    Grading: 50% problem sets, 20% midterm exam, 30% final exam.

    Problem Set Policies: Late problem sets will not be accepted, though your two lowest problem set scores will be dropped. We strongly encourage using LaTeX to typeset your problem sets, and have a LaTeX template that you may use. We will be using the Gradescope online submission system. You should receive an email saying that you've been enrolled in CS265/CME309 from Gradescope. If not, create an account on Gradescope using your Stanford ID and join CS265/CME309 using entry code 9BPKZB.

    You are encouraged to find a partner, and hand in 1 problem set with both names. You must have a different partner for each problem set. As always, collaboration and discussion of the problems is encouraged, though you must understand everything that you hand in, and the writeup that your partner and you hand in must be your own group's writing.




    Lecture 1 (9/26): Introduction, models of computation, randomized polynomial identity testing algorithm (Schwartz-Zippel).
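
    As a rough illustration of the polynomial identity testing idea from Lecture 1, here is a minimal sketch. It is not taken from the lecture: the function name `probably_identical`, the black-box interface, and the choice of modulus are assumptions made for the example. It tests identity modulo a large prime, so two integer polynomials whose difference vanishes modulo that prime would be reported as identical.

```python
import random

# Hypothetical helper (not from the course materials): one-sided randomized
# identity test for two polynomials given as black-box functions.
def probably_identical(f, g, num_vars, trials=20, prime=2_147_483_647):
    """Return False if a random evaluation point separates f and g mod `prime`.

    If f and g are the same polynomial, this always returns True.
    If f - g is a nonzero polynomial mod `prime` of total degree d, the
    Schwartz-Zippel bound says a single uniformly random point is a root
    of f - g with probability at most d / prime.
    """
    for _ in range(trials):
        point = tuple(random.randrange(prime) for _ in range(num_vars))
        if (f(point) - g(point)) % prime != 0:
            return False  # a witness that the polynomials differ
    return True  # no witness found in any trial


# Usage: (x + y)^2 versus its expansion, and versus a wrong expansion.
lhs = lambda p: (p[0] + p[1]) ** 2
rhs_good = lambda p: p[0] ** 2 + 2 * p[0] * p[1] + p[1] ** 2
rhs_bad = lambda p: p[0] ** 2 + p[1] ** 2
print(probably_identical(lhs, rhs_good, num_vars=2))
print(probably_identical(lhs, rhs_bad, num_vars=2))
```

    The test has one-sided error: a False answer is always correct, while a True answer is only correct with high probability.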

    Lecture 2 (9/28): Randomized min-cut algorithm [sec 1.5 in Prob. and Comp. edition 2, sec 1.4 in P&C ed. 1], linearity of expectation, coupon-collecting [sec 2.4 in P&C ed 1 and 2], and analysis of Quicksort with random pivoting [sec 2.5 in P&C].
    [Lecture 2 video].
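
    The randomized min-cut algorithm from Lecture 2 can be sketched as follows. This is an illustrative implementation, not the one from the book or the lecture: the names `contract_once` and `min_cut`, the edge-list input format, and the default number of trials are assumptions. Contracting edges in a uniformly random order while skipping edges inside an already-merged group is one common way to realize the random contraction step.

```python
import random

def contract_once(num_vertices, edges):
    """Run one random contraction and return the size of the resulting cut."""
    parent = list(range(num_vertices))  # union-find forest over the vertices

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    order = list(edges)
    random.shuffle(order)
    components = num_vertices
    for u, v in order:
        if components == 2:
            break
        ru, rv = find(u), find(v)
        if ru != rv:  # skip self-loops of the contracted multigraph
            parent[ru] = rv
            components -= 1
    # Edges whose endpoints ended up in different groups form the cut.
    return sum(1 for u, v in edges if find(u) != find(v))


def min_cut(num_vertices, edges, trials=300):
    """Repeat the contraction and keep the smallest cut found (may overestimate)."""
    return min(contract_once(num_vertices, edges) for _ in range(trials))


# Usage: two triangles joined by a single bridge edge.
bridge_graph = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(min_cut(6, bridge_graph))
```

    A single contraction finds a fixed minimum cut with probability at least 2/(n(n-1)), so the number of trials needed for a high-probability guarantee grows roughly like n^2 log n; the default of 300 here is only suitable for small examples.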

    Problem Set 1 HERE. Due 10am, Thursday 10/5.

    Lecture 3 (10/3): Randomized Primality Testing. This is what originally put randomized algorithms on the map back in the late 1970's, and modern cryptography as we know it would not be possible without the existence of fast randomized primality checking algorithms. Lecture notes HERE.

    Lecture 4 (10/5): Proof of correctness of Rabin/Miller randomized primality testing algorithm (see notes from Lecture 3).
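
    For readers who want to experiment, a minimal sketch of a Rabin/Miller-style test is given below. It follows the standard textbook formulation, which may differ in details from the version in the lecture notes; the function name `is_probable_prime`, the small-prime pre-check, and the default number of rounds are assumptions of this example.

```python
import random

def is_probable_prime(n, rounds=20):
    """Return False if n is certainly composite, True if n is probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):  # handle small cases and easy composites
        if n % p == 0:
            return n == p
    # Here n is odd and at least 11. Write n - 1 = d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue  # this base gives no evidence of compositeness
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


# Usage: a Mersenne prime and a Carmichael number (37 * 73 * 109).
print(is_probable_prime(2**61 - 1))
print(is_probable_prime(294409))
```

    Primes are always accepted. For an odd composite n, each round fails to find a witness with probability at most 1/4, so the chance of wrongly returning True is at most 4^(-rounds).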

    Problem Set 2 HERE. Due 10am, Thursday 10/12.

    Lecture 5 (10/10): Markov's inequality and Chebyshev's inequality (sections 3.1, 3.2, and 3.3 in Prob. and Comp.). Moment generating functions and Chernoff bounds (sections 4.1-4.3 in Prob. and Comp.).

    Lecture 6 (10/12): Randomized routing on the hypercube (section 4.6 in Prob. and Comp.).

    Problem Set 3 HERE. Due 10am, Thursday 10/19.

    Lecture 7 (10/17): Balls in bins, the "Poissonization" technique (Sections 5.2-5.5 in Prob. and Comp.) and "Power of Two Choices" (section 17.1 in Prob. and Comp.)

    Lecture 8 (10/19): Metric embeddings, and the probabilistic embedding of any n-point metric into L1 with O(log(n)) distortion (due to Bourgain). Lecture notes here.

    Problem Set 4 HERE. Due 10am, Thursday 10/26.

    Lecture 9 (10/24): Metric embeddings continued, Johnson-Lindenstrauss [JL] dimension reduction, and high-level discussion of the "Fast Johnson-Lindenstrauss" transform. Lecture notes here contain a proof of JL and a discussion of locality sensitive hashing (an application of JL). We didn't cover the Fast JL in much detail, and I haven't added it to the notes yet: the original paper is here, and a higher-level, more concise treatment is here.

    Lecture 10 (10/26): The Probabilistic Method: bounding Ramsey numbers, derandomization via conditional expectations. (Chapter 6 in Prob. and Comp, NOT including Lovasz Local Lemma sections).

    Problem Set 5 HERE. Due 10am, Friday 11/3.

    Lecture 11 (10/31): Lovasz Local Lemma, and examples. (Section 6.7 in Prob. and Comp.)
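
    For reference, the symmetric form of the lemma can be stated as follows. This is the standard textbook statement, written out here as a convenience rather than copied from the lecture; the symbols p and d are the usual ones and are not defined elsewhere on this page.

```latex
\textbf{Lemma (symmetric LLL).} Let $A_1, \dots, A_n$ be events such that
\begin{enumerate}
  \item $\Pr[A_i] \le p$ for every $i$;
  \item each $A_i$ is mutually independent of all but at most $d$ of the other events;
  \item $4dp \le 1$.
\end{enumerate}
Then
\[
  \Pr\Bigl[\,\bigcap_{i=1}^{n} \overline{A_i}\,\Bigr] > 0 .
\]
```

    In the directed-dependency-graph version mentioned in the ps6 announcement, d is the maximum out-degree of an event in the graph.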

    Lecture 12 (11/2): Constructive LLL. In class we saw Moser's "entropic" proof; see Terry Tao's blog post for a very nice summary here. Lecture notes from two years ago, covering the algorithm, the formal theorem statement, and a more general (but less pretty) proof via Galton-Watson branching processes, are here. Also see the 2014 paper of Achlioptas and Iliopoulos extending Moser's entropic arguments here.

    In-class midterm (11/7) Solutions here.

    Lecture 13 (11/9): Intro to Markov Chains, and analysis of randomized 2-SAT algorithm. (Sections 7.1-7.2 in Prob. and Comp.)

    Problem Set 6 HERE. Due 10am, Friday 11/17.

    Lecture 14 (11/14): Fundamental Theorem of Markov Chains/stationary distributions (sections 7.3 and 7.4 in Prob. and Comp.), Markov Chain Monte Carlo (Section 11.4)

    Lecture 15 (11/16): Mixing Times, Strong Stationary Times, and Coupling (chapter 11 in Prob. and Comp.).

    Problem Set 7 HERE. Due 10am, Friday 12/1.

    Lecture 16 (11/28): Martingales, the Doob Martingale, and Azuma-Hoeffding tail bounds (sections 13.1, 13.4, and 13.5 in Prob. and Comp.).

    Lecture 17 (11/30): Martingales, stopping times and the martingale stopping theorem (section 13.2 in Prob. and Comp.) and Wald's Equation (section 13.3). A formal proof of the stopping theorem is not given in the text; see the notes here for a formal proof, and a bit more about the formality required for proving these martingale results.

    Problem Set 8 HERE. Due 10am, Friday 12/8.

    Lecture 18 (12/5): Crash course on coding theory. [This material will not be on the final exam.]

    Lecture 19 (12/7): Discussion of some recent research related to the course.