Balanced Allocation on Graphs
It is well known that if n balls are inserted into n bins, with high
probability, the bin with maximum load contains (1+o(1)) log n / loglog
n balls. Azar, Broder, Karlin, and Upfal showed that if, instead of
choosing one bin, d >= 2 bins are chosen at random and the ball is
inserted into the least loaded of the d bins, the maximum load drops
drastically to loglog n / log d + O(1). However, in the context of hashing
applications, the bins may represent hash buckets of a hash table stored
on a disk and hence the cost of accessing the two bins (for d=2) depends
on the distance between them.
We study the two-choice balls-and-bins process when a ball is not allowed to pick any two random bins, but only two bins that are connected by an edge in an underlying graph. We show that for n balls and n bins, if the graph is almost regular with degree n^\eps, where \eps is not too small, the previous bounds on the maximum load continue to hold: precisely, the maximum load is loglog n + O(1/\eps). So even if the graph has degree n^{\Omega(1/loglog n)}, the maximum load is O(loglog n). For general \Delta-regular graphs, we show that the maximum load is loglog n + O(\frac{log n}{log (\Delta/log^4 n)}) and provide an almost matching lower bound of loglog n + \Omega(\frac{log n}{log (\Delta log n)}). Further, these bounds fail for non-regular graphs, even when the minimum degree is high.
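As a quick illustration of the graph-restricted process, here is a minimal simulation sketch of our own (names and parameters are illustrative, not from the paper): each ball picks a uniformly random edge of the graph and is placed in the less loaded of its two endpoints.

    import random

    def two_choice_on_graph(edges, n, balls=None):
        """Each ball picks a uniformly random edge of the graph -- its two
        allowed bins -- and goes to the less loaded endpoint (ties random)."""
        load = [0] * n
        for _ in range(balls or n):
            u, v = random.choice(edges)
            if load[v] < load[u] or (load[v] == load[u] and random.random() < 0.5):
                u = v
            load[u] += 1
        return max(load)

    # a sparse random multigraph with average degree 16 on n bins
    n = 1 << 12
    edges = [(random.randrange(n), random.randrange(n)) for _ in range(8 * n)]
    print("max load:", two_choice_on_graph(edges, n))

For comparison, the classical unrestricted two-choice process corresponds to running the same experiment on the complete graph.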
Voecking showed that the maximum load under d-choice load balancing can be further improved to O(loglog n / d) by breaking ties to the left. This, however, requires d random bin choices. We show that the same bound can be achieved with only two random accesses, querying d/2 contiguous bins in each access. Group a sequence of n bins into 2n/d aligned groups of d/2 consecutive bins each; if each ball chooses two groups at random and is inserted into the least loaded bin of the lesser-loaded group, then the maximum load is O(loglog n / d) with high probability. Furthermore, this partitioning into aligned groups of size d/2 turns out to be essential for achieving this bound.
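Here is a sketch of the grouped variant as well (our reading: groups are compared by total load, which is one natural interpretation of "lesser loaded"; the paper's precise rule may differ).

    import random

    def grouped_two_choice(n, d):
        """Two random accesses simulating d choices: bins are split into
        2n/d aligned groups of d/2 consecutive bins; each ball samples two
        groups, takes the lesser-loaded one (here: smaller total load),
        and goes to its least loaded bin, breaking ties to the left."""
        half = d // 2
        assert n % half == 0
        load = [0] * n
        groups = [range(g * half, (g + 1) * half) for g in range(n // half)]
        for _ in range(n):
            g1, g2 = random.randrange(len(groups)), random.randrange(len(groups))
            g = min(g1, g2, key=lambda g: sum(load[b] for b in groups[g]))
            b = min(groups[g], key=lambda b: (load[b], b))  # leftmost on ties
            load[b] += 1
        return max(load)

    print("max load:", grouped_two_choice(n=1 << 12, d=8))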
Joint work with Rina Panigrahy. The talk is based on our SODA 2006 paper.
The Complexity of ICP (Iterative Closest Point)
and k-means: Lower Bounds and Smoothed Analysis
The k-means method and the ICP algorithm are two widely used local
search heuristics. While both run stunningly fast in practice, their
theoretical complexity has been poorly understood. We begin by
presenting exponential time lower bounds for both algorithms. To
reconcile the difference between the observed and theoretical running
times, we turn to the smoothed analysis of Spielman and Teng. In this
setting all points are randomly perturbed before running the
algorithm. This step breaks brittle lower bounds and allows us to
prove that the smoothed complexity of both algorithms is independent
of the dimensionality of the data, thus yielding a polynomial smoothed
upper bound for ICP, and an O(n^k) smoothed upper bound for k-means.
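To make the smoothed setting concrete, here is a minimal sketch (an illustration of ours, not the paper's experiments): perturb every input point with independent Gaussian noise, then run the standard Lloyd/k-means local search and count iterations until convergence.

    import numpy as np

    def smoothed_kmeans(points, k, sigma, seed=0):
        """Perturb each point by independent N(0, sigma^2) noise, then run
        Lloyd's k-means local search until the clustering stops changing;
        returns the number of iterations taken."""
        rng = np.random.default_rng(seed)
        X = points + rng.normal(scale=sigma, size=points.shape)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        assign, iters = None, 0
        while True:
            iters += 1
            # assignment step: nearest center for every point
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            new_assign = d2.argmin(axis=1)
            if assign is not None and np.array_equal(new_assign, assign):
                return iters
            assign = new_assign
            # update step: each center moves to its cluster's mean
            for j in range(k):
                if (assign == j).any():
                    centers[j] = X[assign == j].mean(axis=0)

    points = np.random.default_rng(1).random((500, 10))
    print("Lloyd iterations on perturbed input:",
          smoothed_kmeans(points, k=5, sigma=0.01))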
The Complexity of Nash Equilibrium
In 1951, Nash showed that every game has a mixed Nash equilibrium.
His proof is a reduction to Brouwer's fixpoint theorem and places the
problem of finding equilibria into the realm of 'exponential existence
proofs'. In fact, whether Nash equilibria can be computed in polynomial
time has remained open since that time, and has come under increased
scrutiny during the past two decades. In this talk, I will present some
recent results (joint work with Paul Goldberg and Christos Papadimitriou)
that shed light on this problem, in the negative direction. I will also
present positive results (joint work with Alex Fabrikant and Christos
Papadimitriou) suggesting how "the game world is flat".
Oblivious Network Design
Consider the following network design problem: given a network $G = (V,E)$, source-sink pairs $\{s_i, t_i\}$ arrive and desire to send
a unit of flow between themselves. The cost of the routing is this: if edge $e$ carries a total of $f_e$ flow (from all the terminal
pairs), the cost is given by $\sum_e \load(f_e)$, where $\load$ is some concave cost function; the goal is to minimize the total cost
incurred. However, we want the routing to be oblivious: when terminal pair $(s_i, t_i)$ makes its routing decisions, it does not know
the current flow on the edges of the network, nor the identity of the other pairs in the system. Moreover, it does not even know the
identity of the function $\load$, merely knowing that $\load$ is a concave function of the total flow on the edge. How should it
(obliviously) route its one unit of flow? Can we get competitive algorithms for this problem?
We develop a framework to model oblivious network design problems (of which the above problem is a special case), and give algorithms with poly-logarithmic competitive ratio for problems in this framework (and hence for this problem).
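To make the model concrete, here is a toy sketch of ours (the instance, candidate paths, and load function are all made up): an oblivious routing fixes a per-pair choice of path in advance, and only afterwards is the outcome scored under some concave load function.

    import random
    from collections import Counter

    def oblivious_route(pair, paths):
        """An oblivious scheme fixes, for every terminal pair, a distribution
        over paths in advance; the choice ignores the other pairs and the
        load function. Here paths[pair] lists candidate paths (edge lists)."""
        return random.choice(paths[pair])

    def total_cost(routed, load):
        """Cost of the outcome: sum over edges e of load(f_e), where f_e is
        the total flow that ends up on e (edges are undirected here)."""
        f = Counter(frozenset(e) for path in routed for e in path)
        return sum(load(fe) for fe in f.values())

    # toy instance: 4-cycle a-b-c-d, two pairs, two candidate paths each
    paths = {("a", "c"): [[("a", "b"), ("b", "c")], [("a", "d"), ("d", "c")]],
             ("b", "d"): [[("b", "c"), ("c", "d")], [("b", "a"), ("a", "d")]]}
    routed = [oblivious_route(p, paths) for p in paths]
    print(total_cost(routed, load=lambda f: f ** 0.5))  # e.g. concave sqrt cost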
Joint work with MohammadTaghi Hajiaghayi and Harald Raecke.
Achieving Channel Capacity Against Malicious Errors
Suppose you want to communicate a message of k packets on a noisy communication channel. So you judiciously encode it as a redundant collection of n = ck packets and transmit these. What is the minimum number of correct packets one needs to receive in order to have any hope of recovering the message?
Well, clearly one needs at least k correct packets. In this talk, I will describe recent work that gives an encoding scheme that attains this information-theoretic limit: for any desired eps > 0, it enables recovery of the message as long as at least k(1+eps) packets are received intact. The location of the correct packets and the errors on the remaining packets can be picked adversarially by the channel.
This achieves the optimal trade-off (called "capacity") between redundancy and error-resilience for a malicious noise model where the channel can corrupt the transmitted symbols arbitrarily subject to a bound on the total number of errors. Previously, capacity-achieving error-correcting codes were only known for weaker, stochastic models of the channel noise.
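For intuition about the k-packet limit, here is a minimal sketch of the erasure analogue (a textbook Reed-Solomon argument, not the codes of the talk): if the n packets are values of a polynomial of degree less than k, then any k intact packets at known positions determine the message by interpolation. The talk's setting, where surviving packets may be maliciously corrupted rather than just missing, is far harder.

    import random

    P = 2**31 - 1  # a prime; all arithmetic is over the field GF(P)

    def interpolate_at(points, x0):
        """Evaluate at x0 the unique degree < len(points) polynomial
        through the given (x, y) points (Lagrange interpolation mod P)."""
        total = 0
        for xi, yi in points:
            num, den = 1, 1
            for xj, _ in points:
                if xj != xi:
                    num = num * (x0 - xj) % P
                    den = den * (xi - xj) % P
            total = (total + yi * num * pow(den, P - 2, P)) % P
        return total

    def rs_encode(message, n):
        """Systematic Reed-Solomon: packet x is p(x), where p is the unique
        degree < k polynomial with p(0), ..., p(k-1) equal to the message."""
        return [interpolate_at(list(enumerate(message)), x) for x in range(n)]

    k, n = 4, 12
    message = [7, 13, 42, 99]
    packets = rs_encode(message, n)
    survivors = random.sample(list(enumerate(packets)), k)  # channel erases the rest
    recovered = [interpolate_at(survivors, x) for x in range(k)]
    assert recovered == message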
These results are obtained in an error-recovery model called list decoding. The talk will introduce and motivate the problem of list decoding, and then give a peek into the algebraic ideas and constructions that lead to the above result. We will also describe some challenging questions that still remain open.
The talk will be self-contained. Based on joint work with Atri Rudra.
New Market Models and Algorithms
The notion of a ``market'' has undergone a paradigm shift with the Internet
-- totally new and highly successful markets have been defined and launched
by companies such as Google, Yahoo!, Amazon, MSN and eBay. Another major
change is the availability of massive computational power for running these
markets in a centralized or distributed manner.
In view of these new realities, the study of market equilibria, an essentially non-algorithmic theory within Mathematical Economics, needs to be revived and rejuvenated with new models, ideas, and an inherently algorithmic approach.
In this work we embark on an in-depth study of resource allocation markets. We give strongly polynomial algorithms for finding equilibria in several of these markets; interestingly enough, these algorithms also solve certain nonlinear convex programs. Our algorithms shed light on several aspects of these markets, including efficiency, fairness, competition monotonicity, and rationality of solutions.
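For context, the canonical example of such a nonlinear convex program is the Eisenberg-Gale program for linear Fisher markets (our guess at the relevant class, suggested by the first paper linked below), whose optimal solutions are exactly the equilibrium allocations:

    \max \sum_i m_i \log u_i
    \quad \text{subject to} \quad
    u_i = \sum_j u_{ij} x_{ij} \ \forall i, \qquad
    \sum_i x_{ij} \le 1 \ \forall j, \qquad
    x_{ij} \ge 0,

where buyer i has budget m_i and derives utility u_{ij} per unit of good j.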
Based on joint work with Kamal Jain:
http://www-static.cc.gatech.edu/fac/Vijay.Vazirani/EG.pdf
and Deeparnab Chakrabarty and Nikhil Devanur:
http://eccc.hpi-web.de/eccc-reports/2006/TR06-029/index.html
LOTs (Leaf Only Trees): Fast, Efficient and Robust In-Network Computation
The need for efficient computation of functions of global state
lies at the heart of a wide range of problems in distributed
systems. Examples include load balancing, leader election and barrier
synchronization, distributed file system maintenance, sensor fusion,
coordinated intrusion detection, and top-K queries in stream-oriented
databases. More precisely, we wish to compute a function g on the
values held by the nodes. In practice, computing such functions
involves in-network computation of partial functions, to avoid the
overhead of exchanging Omega(n)-sized messages.
Epidemic (or gossip) algorithms are known as the most robust and efficient way to disseminate information in a network. However, they suffer from the "double counting" problem. Solutions to this problem either make gossip brittle or depend on variants of the Flajolet-Martin algorithm, which results in errors of up to 50%.
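For background on the double-counting pitfall, the sketch below shows the standard push-sum repair, included only for contrast (it is not the mechanism of this paper): gossip aggregation stays correct only if "mass" is conserved, e.g. by having each node keep a (sum, weight) pair and split it when gossiping.

    import random

    def push_sum(values, rounds=60):
        """Gossip averaging without double counting: each node holds a
        (sum, weight) pair and, per round, keeps half and sends half to a
        random node. Total mass is conserved, so sum/weight at every node
        converges to the true average; naively re-adding forwarded values
        would count them many times over."""
        n = len(values)
        s, w = list(map(float, values)), [1.0] * n
        for _ in range(rounds):
            ns, nw = [0.0] * n, [0.0] * n
            for i in range(n):
                j = random.randrange(n)
                ns[i] += s[i] / 2; nw[i] += w[i] / 2  # keep half
                ns[j] += s[i] / 2; nw[j] += w[i] / 2  # gossip half
            s, w = ns, nw
        return [si / wi for si, wi in zip(s, w)]

    vals = [random.random() for _ in range(200)]
    print("true average:   ", sum(vals) / len(vals))
    print("node 0 estimate:", push_sum(vals)[0])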
In this paper, we present the first efficient, accurate and robust
algorithm for in-network computation of functions of global state. We
provide a simple abstraction of a B-tree on which arbitrary functions
can be evaluated. The scheme is fast and efficient: it terminates in
O(ln n) communication rounds and every node sends most of its
messages to nearby nodes. Most useful functions like sum or approximate
histogram can be embedded in the tree, resulting in messages of size
O(ln n). For functions that need to be distorted to be embedded
into the tree, the price is not loss of precision but larger messages,
of size proportional to the degree of distortion. Our analysis includes
bounds on the degree of load imbalance and on tolerance to node failures.
Testing properties of distributions in sublinear time
The focus of this talk will be sublinear time algorithms for testing
properties of distributions. The input is a discrete distribution
accessible via sampling and the question is to determine if the
distribution has a certain property or is far from every distribution with
that property. Interesting properties include closeness in various norms,
independence, monotonicity in one and higher dimensions, entropy, etc.
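As a taste of the techniques, here is a standard collision-based uniformity tester (given purely as an illustration of ours): the probability that two independent samples collide equals the squared L2 norm of the distribution, which is minimized by the uniform distribution, so a collision count separates uniform from far-from-uniform; in theory O(sqrt(n)) samples suffice, though the toy below uses more for a stable estimate.

    import random
    from itertools import combinations

    def collision_statistic(samples):
        """Fraction of sample pairs that collide: an unbiased estimate of
        sum_x p(x)^2, which equals 1/n for the uniform distribution on a
        domain of size n and is strictly larger for anything far from it."""
        pairs = list(combinations(samples, 2))
        return sum(a == b for a, b in pairs) / len(pairs)

    n, m = 10_000, 2_000  # domain size; number of samples
    uniform = [random.randrange(n) for _ in range(m)]
    skewed = [random.randrange(n // 2) for _ in range(m)]  # half the domain
    print("uniform:", collision_statistic(uniform), "  vs 1/n =", 1 / n)
    print("skewed: ", collision_statistic(skewed), "  vs 2/n =", 2 / n)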
Group-theoretic Algorithms for Matrix Multiplication
We present a group-theoretic approach to producing fast algorithms for
matrix multiplication. In this framework, one devises algorithms by
constructing non-abelian groups with certain properties. The algorithms
themselves are natural and are based on taking the discrete Fourier
transform over these groups.
We construct several families of groups that achieve matrix multiplication exponent significantly less than 3 (but not better than the current best bound, 2.376...). This leads to two appealing conjectures, one combinatorial and the other algebraic. Either one would imply that the exponent of matrix multiplication is 2.
This is joint work with Henry Cohn, Bobby Kleinberg, and Balazs Szegedy.
The talk is based on two papers:
1. "A group-theoretic approach to fast matrix multiplication" (FOCS 2003) http://front.math.ucdavis.edu/math.GR/0307321
2. "Group-theoretic algorithms for matrix multiplication" (FOCS 2005)
http://www.cs.caltech.edu/~umans/papers/CKSU05.pdf
Playing the k-armed bandit with adaptive payouts
The "k-armed bandit" is a modified slot machine, which has k arms
you can pull, rather than just one. Each arm has a different payout,
which can change every time you play, but is always in the range [0,1].
If we get to play this machine T times, how well can we do, compared
to the best single arm in hindsight? Specifically, the goal is to minimize
"regret", defined as the random variable Payout(algorithm) minus
Payout(best arm).
The character of this problem depends strongly on how much information is given to the algorithm. In the "full information" version of the game, the algorithm is told the payout for all k arms after each round. This version of the problem is well understood, and randomized algorithms are known which, with high probability, guarantee regret O(\sqrt{T log k}), which is optimal. In the "bandit" version of the game, the algorithm is only told the payout for the arm it chose in the last round. This setting is harder, with a lower bound of Omega(\sqrt{T k}) and a previous best upper bound of O(T^{2/3}) (for fixed k).
We present a new randomized algorithm for the bandit version, which achieves regret O(\sqrt{T k log k}) with high probability.
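For reference, here is a minimal sketch of the classic Exp3 algorithm, which guarantees O(\sqrt{T k log k}) regret in expectation; the contribution above is an algorithm achieving such a bound with high probability, which plain Exp3 does not.

    import math, random

    def exp3(payout, k, T, gamma):
        """Exp3 for the adversarial bandit: keep multiplicative weights
        over the k arms, mix in gamma-uniform exploration, and update only
        the pulled arm with an importance-weighted payout estimate
        (payouts are assumed to lie in [0, 1])."""
        w = [1.0] * k
        total = 0.0
        for t in range(T):
            W = sum(w)
            p = [(1 - gamma) * wi / W + gamma / k for wi in w]
            arm = random.choices(range(k), weights=p)[0]
            x = payout(arm, t)  # only the chosen arm's payout is observed
            total += x
            # x / p[arm] is an unbiased estimate of this arm's payout
            w[arm] *= math.exp(gamma * (x / p[arm]) / k)
        return total

    k, T = 10, 20_000
    gamma = min(1.0, math.sqrt(k * math.log(k) / T))
    payout = lambda arm, t: 0.5 + 0.4 * math.sin(arm + t / 3000)  # toy drift
    print("total payout:", exp3(payout, k, T, gamma))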
This is joint work with Varsha Dani.
Bypassing the Embedding: Algorithms for low dimensional metrics
The doubling dimension of a metric is the smallest $k$ such that any
ball of radius $2r$ in the metric can be covered using $2^k$ balls of
radius $r$. This concept for abstract metrics has been proposed as a
natural analog to the dimension of a Euclidean space. If we could
embed metrics with low doubling dimension into low dimensional
Euclidean spaces, they would inherit several algorithmic and
structural properties of the Euclidean spaces. Unfortunately, however,
such a restriction on dimension does not suffice to guarantee
embeddability.
In this talk, we'll explore the option of bypassing the embedding for
several applications. We'll derive results analogous to Euclidean
spaces for low dimensional abstract metric spaces. In particular,
we'll show that low dimensional abstract spaces admit quasipolynomial-time
approximation schemes for various optimization problems, including the
TSP; short approximate distance labels; and compact approximate
shortest-path routing schemes.
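Many of these algorithms rest on r-nets rather than on any embedding; here is a minimal sketch of the greedy net construction (our own illustration): in a metric of doubling dimension k, every ball of radius 2r is covered by 2^k balls of radius r, so nets stay small and can play the role that grids play in Euclidean space.

    def greedy_net(points, r, dist):
        """Greedy r-net: a subset of points pairwise more than r apart such
        that every point lies within r of some net point. In doubling
        metrics the net's size inside any ball is bounded in terms of the
        doubling dimension, which is what the algorithms exploit."""
        net = []
        for p in points:
            if all(dist(p, q) > r for q in net):
                net.append(p)
        return net

    # toy example: points on a line with the absolute-difference metric
    pts = [0.0, 0.3, 0.9, 1.0, 2.2, 2.3, 5.0]
    print(greedy_net(pts, r=0.5, dist=lambda a, b: abs(a - b)))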
Some New Results for An Old Data Structure, Bloom Filters
We present some new results for Bloom filters, a randomized data
structure for set membership queries based on hashing. Bloom
filters require very small amounts of space, but allow false
positives; they may return that a query item is in the set even
if it is not.
We first show that Bloom filters, which normally require k hash functions, can be modified so that only 2 hash functions are required, with no effect on the asymptotic false positive probability. This may simplify the use of Bloom filters in many practical situations.
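Here is a minimal sketch of the two-hash-function idea, using the double-hashing combination g_i(x) = h1(x) + i*h2(x) mod m to simulate k hash functions (the table size, k, and choice of underlying hash are illustrative):

    import hashlib

    class TwoHashBloom:
        """Bloom filter deriving k probe positions from just two hashes,
        g_i(x) = h1(x) + i * h2(x) mod m, with asymptotically the same
        false positive probability as k independent hash functions."""
        def __init__(self, m, k):
            self.m, self.k = m, k
            self.bits = bytearray(m)  # one byte per bit, for clarity
        def _h(self, item):
            d = hashlib.sha256(item.encode()).digest()
            h1 = int.from_bytes(d[:8], "big") % self.m
            h2 = int.from_bytes(d[8:16], "big") % self.m or 1  # avoid h2 = 0
            return h1, h2
        def add(self, item):
            h1, h2 = self._h(item)
            for i in range(self.k):
                self.bits[(h1 + i * h2) % self.m] = 1
        def __contains__(self, item):
            h1, h2 = self._h(item)
            return all(self.bits[(h1 + i * h2) % self.m] for i in range(self.k))

    bf = TwoHashBloom(m=1 << 16, k=5)
    bf.add("hello")
    print("hello" in bf, "world" in bf)  # True, (almost surely) False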
We then show how Bloom filters can be used in conjunction with the power of two choices to design very effective hash tables, suitable for use in routers and other situations that require highly efficient and effective hash tables under strict memory limitations.
Time permitting, we will also present further results of our ongoing
work on the Bloom filter data structure.
Approximation Algorithms for Confluent Flow
Packet flows in IP networks are confluent, meaning that at any
node all the flow for a given destination exits along one edge. To what
extent does this affect the congestion properties of the flow? This
problem has a combinatorial component --- the problem of characterizing
the worst-case gap between ordinary and confluent flow --- as well as an
algorithmic component: the problem of computing a confluent flow which
approximately minimizes congestion. We will present nearly tight
solutions to both problems in the case of single-sink confluent flows, by
demonstrating:
(A) For any single-sink instance of the confluent flow problem, one can compute, in polynomial time, a solution whose congestion is at most 1+ln(k) times the minimum congestion of an ordinary (splittable) flow, where k is the degree of the sink node.
(B) The gap between confluent and splittable flow can be as large as ln(k) in some instances.
(C) The minimum congestion of a confluent flow is Omega(log k)-hard to approximate, even on single-sink instances.
For single-sink confluent flow instances in k-connected graphs, these results can be substantially improved: there exists a confluent flow whose congestion is within a factor of 2 of the optimal splittable flow. Surprisingly, the proof of this fact uses techniques from algebraic topology.
These results suggest many attractive open problems regarding confluent flow. At the end of the talk we will survey some of these problems.
This is joint work with Jiangzhuo Chen, Laszlo Lovasz, Rajmohan
Rajaraman, Ravi Sundaram, and Adrian Vetta.
An Algorithmic Study of Ad Auctions
The main source of the dramatic revenues of Google, Yahoo and MSN is the
innovative auction used by these search engine companies for advertising.
These auctions allow other companies to bid for ads displayed along with
the results of any search keyword.
There are a number of algorithmic and game theoretic issues that arise in
this context. I will talk about several abstractions which lead to very
interesting questions in the context of algorithms and mechanism design. I
will briefly talk about the problem of online allocation of advertising
space to achieve optimal performance. Then I will discuss designing
truthful auctions when the bidders have budget restrictions or when the
goods are arriving in an online fashion.
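As one concrete instance, here is a sketch of a balance-style rule for the online allocation problem mentioned above (the tradeoff function psi(f) = 1 - e^{f-1} comes from the known 1 - 1/e analysis of online budgeted allocation; this is background of ours, not necessarily the algorithm of the talk):

    import math

    def allocate_queries(bids, budgets, queries):
        """Online budgeted allocation: assign each arriving query to the
        bidder maximizing bid * psi(spent fraction), psi(f) = 1 - e^(f-1),
        so bidders with more remaining budget are favored."""
        spent = {a: 0.0 for a in budgets}
        psi = lambda f: 1 - math.exp(f - 1)
        revenue = 0.0
        for q in queries:
            # feasible bidders: those whose remaining budget covers the bid
            feas = [a for a in budgets
                    if q in bids[a] and spent[a] + bids[a][q] <= budgets[a]]
            if not feas:
                continue
            a = max(feas, key=lambda a: bids[a][q] * psi(spent[a] / budgets[a]))
            spent[a] += bids[a][q]
            revenue += bids[a][q]
        return revenue

    bids = {"adv1": {"shoes": 2.0, "books": 1.0}, "adv2": {"shoes": 1.5}}
    budgets = {"adv1": 3.0, "adv2": 4.0}
    print(allocate_queries(bids, budgets, ["shoes", "shoes", "books", "shoes"]))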
Algorithms for path-planning
Consider a FedEx delivery man who has to deliver packages to various
addresses in the city. Each package has a value and a time-window within
which it must be delivered. The goal is to maximize the total value of
packages delivered within their time-windows. "Path-planning" problems
such as the above arise in a variety of applications such as robot
navigation and supply chain management. We present the first
approximation algorithm for this "time-windows" problem, achieving an
O(log^2 n) approximation ratio, as well as a constant-factor
approximation for its special case, Orienteering, in which every package
has the same delivery deadline.
Rational Secure Computation and Ideal Mechanism Design
We put forward and implement Rational Secure Computation, a stronger
notion of secure computation that does not depend on players’ honesty,
but solely on their rationality.
The key to our result is showing that the ballot box---the venerable
device used throughout the world to privately and correctly compute the
tally of secret votes---can actually be used to securely compute ANY
function of secret inputs.
Our work bridges the fields of Game Theory and Cryptography, and has
broad implications for Mechanism Design. In particular, we exhibit a
polynomial-time algorithm that, given ANY social-choice function, finds
a mechanism that implements it, if any exists.
Joint work with Sergei Izmalkov and Matt Lepinski.