CS369: Metric Embeddings and Algorithmic Applications
Instructor: Tim
Roughgarden (Gates 462)
Time/location:
11AM-12:15 PM on Mondays and Wednesdays in Gates B8.
Course description:
Sample topics: low-distortion embeddings of finite metrics into L_1, L_p,
and distributions of trees; dimensionality reduction;
volume-respecting embeddings; applications to graph partitioning,
online algorithms, network design, and nearest neighbor search.
Prerequisites: CS261 or comparable mathematical maturity.
Course overview:
- Introduction (1 lecture).
- Probabilistic tree embeddings, applications
to network design (3-4 lectures).
- Embeddings into L_2 and L_1, applications to multicommodity flows
and cuts (3-4 lectures).
- Dimensionality reduction in L_1 and L_2, with applications
(1-2 lectures).
- Growth-restricted metrics, notions of dimension, applications
to nearest neighbor search and/or beacon-based embeddings.
(1-3 lectures).
- Embeddings of other important classes of metrics
(e.g. edit distance) (0-2 lectures).
- Volume-respecting embeddings, with applications to bandwidth
(0-1 lectures).
- Embeddings of L_2^2, with connections to semidefinite
programming and sparest cuts (4 lectures).
General references on metric embeddings:
Detailed schedule and references:
- Wed 1/11: Introduction: Concave-cost network design and
graph partitioning.
- Relevant reading: Section 15.1 of the Matousek chapter, scribes notes
from Gupta/Ravi course (first two lectures plus first section
of third lecture).
- Mon 1/16: No class (MLK Jr. Day).
- Wed 1/18: Introduction to probabilistic tree embeddings.
Bartal's (first) theorem achieving O(log^2 n) max expected stretch.
For expository coverage of tree embeddings and Bartal's theorem, see:
- Lectures 8 and 9 from the Gupta/Ravi course;
- Fakcharoenphol/Rao/Talwar,
Approximating Metrics by Tree Metrics (survey),
SIGACT News 2004;
- Shmoys/Williamson, "Cuts and Metrics" chapter from a forthcoming
book on Approximation Algorithms (contact instructor for a copy).
Probabilistic tree embeddings were first considered in
and the original reference for Bartal's theorem is
- Mon 1/23: The FRT theorem: optimal probabilistic tree embeddings.
See the surveys from last lecture, and also the original paper
is quite readable:
- Wed 1/25: Algorithmic applications of probabilistic
tree embeddings. The general result about removing Steiner
nodes from a tree metric while preserving all distances to
within a factor of 8 is from
The application to buy-at-bulk (in a slightly
different formulation) is from:
See also the FRT SIGACT survey and the first three lectures from the
Gupta/Ravi course listed above. The derandomization framework
is from
see also the FRT SIGACT survey. Other classic applications of
probabilistic tree
embeddings to approximation algorithms (that we didn't have time to
cover) include the group Steiner tree problem, where the problem is already
hard to approximate better than O(log^2 n) on trees:
and the metric labeling problem, where the special HST structure
(rather than merely tree structure) of the above probabilistic
tree embeddings is used in a non-trivial way:
- Mon 1/30: Probabilistic embedding of a graph into
one of its spanning trees with maximum expected stretch O(log^2 n).
Reference:
The problem was first considered in the Alon et al. paper
listed in Lecture #2 above. The big breakthrough (polylogarithmic
maximum expected stretch) came in
- Wed 2/1: Finish DGR algorithm and analysis.
- Mon 2/6:
Preliminaries about L_1 metrics.
Bourgain's theorem (embedding arbitrary metrics into
L_1 with O(log n) distortion) via the FRT theorem.
Intro to the max concurrent flow and sparsest cut problems.
Bourgain's Theorem originally appeared in
- J. Bourgain, On Lipschitz embedding of finite metric spaces
in Hilberg space, Israel Journal of Mathematics, 52:46-52, 1985.
- Wed 2/8: Matching upper and lower bounds (of O(log n)) for the
sparsest cut/maximum concurrent flow gap in general graphs with
general demands. Application of sparsest cut to approximate
balanced cuts.
References:
- Mon 2/13: Frechet embeddings; original proof of Bourgain's
theorem (O(log n) distortion for embedding n-point metrics into L_p,
any 1 <= p < \infty).
High-level idea for Rao's theorem (embedding planar metrics into
L_1 and L_2 with O(\sqrt{log n}) distortion). References:
- This stronger version of Bourgain's theorem and the proof
are from the original '85 paper above.
- Highly recommended reading: Sections 15.7 and 15.8 from the
Matousek chapter.
- Wed 2/15: Proof of Rao's theorem.
Other relevant stuff we won't have time for:
- Even though series-parallel graphs (let alone planar) graphs
cannot be embedded into Euclidean space with better than O(\sqrt{\log
n}) distortion (see the 2/27 lecture below), they
can be embedded into
l_1 with O(1) distortion: Gupta/Newman/Rabinovich/Sinclair,
Cuts, trees
and l_1-embeddings of graphs, FOCS '99/Combinatorica '04.
- So can k-outerplanar graphs for fixed k:
Chekuri/Gupta/Newman/Rabinovich/Sinclair, Embedding
k-Outerplanar Graphs into l_1,
SODA '03.
- Mon 2/20: No class (Presidents' Day).
- Wed 2/22: Dimensionality reduction in L_2 (Johnson-Lindenstrauss);
embedding L_2 isometrically into L_1.
- Main reference: lectures 15 and 16 from the Gupta/Ravi course.
- For a long list of applications, see Section 3 of the Indyk
FOCS '01 survey above.
- The JL lemma is originally from Johnson/Lindenstrauss,
Extensions of Lipschitz mappings into a Hilbert space, Contemporary
Mathematics, 1984.
- The version using independent Gaussians is from
Indyk/Motwani, Approximate
Nearest Neighbors: Towards
Removing the Curse of Dimensionality, STOC '98.
- See also Dasgupta/Gupta, An
elementary proof of a theorem of Johnson and Lindenstrauss,
Random Structures and Algorithms '03.
- For a version that uses only -1/+1 random variables, see
Achlioptas, Database-friendly
Random Projections:
Johnson-Lindenstrauss with Binary Coins, PODS '01/JCSS '03.
- Mon 2/27: Lower bounds for Euclidean embeddings: Enflo's theorem,
the diamond graphs, and the optimality of Rao's theorem.
- Main reference: Section 15.4 of the Matousek chapter.
- The diamond graphs were introduced in Gupta/Newman/Rabinovich/Sinclair,
Cuts, trees
and l_1-embeddings of graphs, FOCS '99/Combinatorica '04.
In this paper, they were used to show that the FRT theorem is tight
even for series-parallel graphs (i.e., probabilistic tree embeddings
must incur \Omega(\log n) max expected stretch for such graphs).
- In Newman/Rabinovich, A
lower bound on the distortion
of embedding planar metrics into Euclidean space, SCG '02/DCG '03,
they were used to show that Rao's theorem is tight: series-parallel,
and hence planar, graphs can require \Omega(\sqrt{\log n}) distortion
to embed into Euclidean space.
- Wed 3/1: Impossibility of dimensionality reduction in L_1.
- Mon 3/6: Doubling metrics and nearest neighbor search. References:
For more on nearest-neighbor search in doubling metrics, see
For more algorithms for various problems in doubling metrics, see e.g.
- Wed 3/8: The ARV algorithm for uniform sparsest cut.
References:
- Mon 3/13: More on the ARV algorithm for uniform sparsest cut.
- Wed 3/15: Finish the ARV algorithm for uniform sparsest cut.
Bonus lectures (Spring 2006):
- Tue 4/25: Introduction to the bandwidth problem; construction of
volume-respecting embeddings. Our main references are:
Recent related papers include:
- Dunagan/Vempala, On
Euclidean embeddings and bandwidth
minimization, APPROX 2001 (a bit better approximation ratio);
- Rao,
Small
distortion and volume preserving
embeddings for planar and Euclidean metrics, SCG '99 (a different
construction of volume-respecting embeddings, plus improved results
for graphs excluding a fixed minor);
- Gupta, Improved
Bandwidth Approximation for Trees and
Chordal Graphs, SODA 2000/J Algorithms 2001 (improved results for
trees);
- Feige/Talwar, Approximating
the Bandwidth of Caterpillars, APRROX 2005 (better bounds for
caterpillars);
- Unger, The Complexity of the Approximation of the Bandwidth
Problem, FOCS 1998 (inapproximability results);
- Vempala, Random
Projection: A New Approach to VLSI
Layout, FOCS 1998 (application of
volume-respecting embeddings to graph partitioning).
For an overview of some of this work, see Chapter 4 of the following
monograph:
- Tue 5/2: Finish construction of volume-respecting embeddings.
Analysis of Feige's bandwidth algorithm. (See references from
last week.)
- Tue 5/9: Euclidean embeddings of negative type metrics
and the (general) sparsest cut problem.
We will focus on:
For the state of the art, see the O(log n log log n)-distortion
embedding (again, from l_2^2 to l_2) in:
- Tue 5/16: Euclidean embeddings of negative type metrics
and the (general) sparsest cut problem (con'd); see references
above.
- Tue 5/30: Aspects of measured descent; a third proof of
Bourgain's theorem. Main reference:
To learn about measured descent more generally, see
as well as the more general "Gluing Lemma" in