SAS is the Stanford Algorithms Seminar (formerly the AFLB -- Algorithms for Lunch Bunch). It is run by Mayur Datar and Kamesh Munagala. Meetings are generally Thursdays at 2:45 p.m.

Talks are held in the Gates Building, near the Main Quad of Stanford's campus. Click here for directions.

## Contents:

1. 13 Oct 2000 : Mikkel Thorup (AT&T Labs) . Approximate Distance Oracles.
2. 9th Nov 2000 : Michael Mitzenmacher (Harvard University) . Using Multiple Hash Functions to Improve IP Routing.
3. 16th Nov 2000 : Alistair Sinclair (UC Berkeley) . Approximating the permanent.
4. 15th March 2001 : David Williamson (IBM Almaden) . An application of complex semidefinite programming to approximation algorithms.
5. 19th April 2001 : D. Sivakumar (IBM Almaden) . A sieve algorithm for the shortest lattice vector problem.
6. 25th April 2001 : Tami Tamir (Technion) . Algorithms for Packing and Scheduling with Constraints.
7. 17th May 2001 : Chandra Chekuri (Bell Labs) . Algorithms for Minimizing Weighted Flow Time.
8. 31st May 2001 : Ziv Bar-Yossef (Berkeley). Sampling Algorithms: Lower Bounds and Applications.

## 13th Oct 2000

Gates Building 400. 3:00 p.m.

## Mikkel Thorup (AT&T Labs)

### Approximate Distance Oracles

Let $G=(V,E)$ be an undirected weighted graph with $|V|=n$ and $|E|=m$. Let $k$ be a fixed positive integer. We show that $G$ can be preprocessed in $\O(mn^{1/k})$ expected time, constructing a data structure of expected size $O(n^{1+1/k})$, such that any subsequent distance query can be answered, approximately, in $O(1)$ time. The approximate distance returned is of stretch at most $2k-1$, i.e., it is at least as large as the actual distance and is at most $2k-1$ times the actual distance. The amount of space used by our data structure is essentially optimal relative to the stretch. Previously, data structures that used only $O(n^{1+1/k})$ space had a query time of $\Omega(n^{1/k})$ and a slightly worst stretch.

This is joint work with Uri Zwick.

## 9th Nov 2000

Gates Building 498. 5:30 p.m.

## Michael Mitzenmacher (Harvard University)

### Using Multiple Hash Functions to Improve IP Routing

High performance Internet routers require a mechanism for very efficient IP address look-ups. Some techniques used to this end, such as binary search on levels, need to construct quickly a good hash table for the appropriate IP prefixes. In this talk we describe an approach for obtaining good hash tables by using multiple hashes on each IP address. The methods we describe are fast, simple, scalable, parallelizable, and flexible.

The double hashing technique is of general interest for applications othter than IP routing. In particular, in instances where the goal is to have each hash bucket fit into a single cache line, using multiple hashes proves extremely suitable. We provide a general analysis of this hashing technique as well as demonstrate its performance for binary search on levels.

Includes joint work with Berthold Voecking and Andrei Broder.

## 16th Nov 2000

Gates Building 498. 3:00 p.m.

## Alistair Sinclair (UC Berkeley)

### Approximating the permanent

I will present a fully-polynomial randomized approximation scheme for the permanent of an arbitrary matrix with non-negative entries. This problem had been open for some time.

Joint work with Mark Jerrum and Eric Vigoda.

## 15th March 2001

Gates Building 498. 4:15 p.m.

## David Williamson (IBM Almaden)

### An application of complex semidefinite programming to approximation algorithms

A number of recent papers on approximation algorithms have used the square roots of unity, -1 and 1 to represent binary decision variables for problems in combinatorial optimization, and have relaxed these to unit vectors in real space using semidefinite programming in order to obtain near optimal solutions to these problems. In this talk, we consider using the cube roots of unity, 1, e^{i 2\pi/3}, and e^{i 4\pi/3}, to represent ternary decision variables for problems in combinatorial optimization. Here the natural relaxation is that of unit vectors in complex space. We use an extension of semidefinite programming to complex space to solve the natural relaxation, and use a natural extension of the random hyperplane technique to obtain near-optimal solutions to the problems.

In particular, we consider the problem of maximizing the total weight of satisfied equations x_u - x_v \equiv c (mod 3) and inequations x_u - x_v \not\equiv c (mod 3), where x_u \in {0,1,2} for all u. This problem can be used to model the MAX 3-CUT problem and a directed variant we call MAX 3-DICUT. For the general problem, we obtain a .79373-approximation algorithm. If the instance contains only inequations (as it does for MAX 3-CUT), we obtain a performance guarantee of 7/12 + (3/(4\pi^2)) arccos^2(-1/4), which is about .83601. This compares with proven performance guarantees of .800217 for MAX 3-CUT (by Frieze and Jerrum) and 1/3 + 10^{-8} for the general problem (by Andersson, Engebretson, and Hastad). A .83601-approximation algorithm for MAX 3-CUT has also been discovered independently by de Klerk, Paschenik, and Warners.

This is joint work with Michel X. Goemans of MIT.

## 19th April 2001

Gates Building 498. 2:45 p.m.

## D. Sivakumar (IBM Almaden)

### A sieve algorithm for the shortest lattice vector problem

We will describe a randomized algorithm to compute the shortest non-zero vector in an n-dimensional rational lattice. The shortest lattice vector problem (SVP) is important in several applications in the areas of computational algebra, coding theory, cryptanalysis, factoring, integer programming, and public-key cryptography.

Our SVP algorithm runs in time exp(c n), where n is the dimension of the lattice, and c is a constant. The previous best time upper bound for this problem is from a deterministic algorithm due to Ravi Kannan (1983); his algorithm runs in time exp(d n log n) for some constant d. We will mention some consequences of our algorithm for related problems on lattices and codes, including an improvement for polynomial time approximations to the shortest vector problem.

The exposition will be fairly self-contained. This is joint work with Miklos Ajtai and Ravi Kumar.

## 25th April 2001

Gates Building 498. 4:15 p.m.

## Tami Tamir (Technion)

### Algorithms for Packing and Scheduling with Constraints

We consider variants of the classic knapsack problem, in which sets of items of different types need to be placed in multiple bins. The items may be of different sizes and values. Each bin has a limited capacity, and a bound on the number of different types of items it can hold. In the class-constrained multiple knapsack problem (CCMK) we wish to maximize the total value of the packed items. In the fair placement problem (FPP) our goal is to place the same (large) portion from each set. In the class-constrained bin-packing problem (CCBP), we seek to minimize the number of (identical) bins, needed for packing all the items. These problems are motivated by problems arising in storage management for multimedia systems, and in production planning.

A similar constraint can be presented in scheduling problems, where a set of $n$ jobs needs to be scheduled on $m$ uniform machines. Each job, $j$, has a length and an allotment parameter, $a_j$, which specifies the maximal number of machines that can share its execution. The cases where $a_j=1$ (non-preemptive scheduling), and $a_j=m$ (preemptive scheduling), are well-studied. We consider the problem of minimizing the completion time of all jobs (the Makespan) when each job has an arbitrary allotment parameter $1 < a_j < m$.

We present hardness results, approximation algorithms, and characterize classes of instances for which exact solutions can be found efficiently. For example, the scheduling problem is shown to be strongly NP-hard when all the jobs have $a_j = 3$, but solvable optimally when for all the jobs $a_j \ge 5$ (the case where $\forall j, a_j=4$ is open). For the CCMK and the CCBP problems we present polynomial time approximation schemes. Optimal algorithms are presented for the CCMK and FPP problems with identical knapsacks and unit-sized items.

The talk is based on my Ph.D thesis. Thesis advisor: Dr. Hadas Shachnai.

## 17th May 2001

Gates Building 498. 4:15 p.m.

## Chandra Chekuri (Bell Labs)

### Algorithms for Minimizing Weighted Flow Time

We consider the following natural scheduling problem. Jobs arrive over time to be scheduled on a single machine. Each job has a processing time and a weight. We seek to minimize the sum of weighted flow times of the jobs. The flow time of a job is the overall time the job spends in the system from the time it arrives to the time its processing is completed. We allow preemption of jobs. The algorithm that at any time schedules the job with shortest remaining processing time (SRPT) is known to be optimal when all jobs have the same weight, however no non-trivial algorithms were known for the weighted problem.

We discuss some new on-line and off-line algorithms for the weighted problem on a single machine. We show a semi-online algorithm for the single machine problem that has an O(\log^2 P) competitive ratio. Here P is the ratio of the maximum job size to the minimum job size. In the off-line setting we show that a (2+\eps)-approximation is achievable in quasi-polynomial time. We show an \Omega(\sqrt{P}) lower bound on online algorithms for multiple machines, this is in contrast to the un-weighted flow time for which SRPT achieves a logarithmic competitive ratio. We also discuss improved ratios for minimizing total stretch on multiple machines, an interesting special case of weighted flow time, where the weight of a job is inversely proportional to its processing time.

Time permitting, we will discuss some recent work that yields approximation schemes for weighted flow time.

Joint work with Sanjeev Khanna and An Zhu.

## 31st May 2001

Gates Building 498. 4:15 p.m.

## Ziv Bar-Yossef (Berkeley)

### Sampling Algorithms: Lower Bounds and Applications.

We develop a framework to study probabilistic sampling algorithms that approximate general functions of the form f: A^n ---> B, where A and B are arbitrary sets. Our goal is to obtain lower bounds on the query complexity of functions, namely the number of input variables x_i that any sampling algorithm needs to query to approximate f(x_1,...,x_n).

We define two quantitative properties of functions - the block sensitivity and the minimum Hellinger distance - that give us techniques to prove lower bounds on the query complexity. These techniques are quite general, easy to use, yet powerful enough to yield tight results. Our applications include the mean and higher statistical moments, the median and other selection functions, and the frequency moments, where we obtain lower bounds that are close to the corresponding upper bounds.

Joint work with Ravi Kumar and D. Sivakumar.