SAS is the Stanford Algorithms Seminar (formerly the AFLB -- Algorithms for Lunch Bunch). It is run by Craig Silverstein. Meetings are generally Thursdays at 4:15 p.m.

Talks are held in the Gates Building, near the Main Quad of Stanford's campus. Click here for directions.

## Contents:

1. 9 October 1996: Roy Armoni (Hebrew University). Discrepancy Sets and Pseudorandom Generators for Combinatorial Rectangles.
2. 17 October 1996: Bruce Donald (Cornell). Algorithmic Foundations for Planar Force-Field Devices: Geometric Algorithms for Massively-Parallel Distributed Manipulation.
3. 24 October 1996: Monika Henzinger (Digital SRC and Cornell). New Bounds for Computing Vertex Connectivity.
4. 31 October 1996: Danny Raz (ICSI). Approximating Total Flow Time on Parallel Machines.
5. 21 November 1996: Moses Charikar (Stanford). Dynamic Servers: A Model for Kinematic Structures and Video-on-Demand.
6. 9 January 1997: Marshal Bern (Xerox PARC). Meshing, Morphing, and Mapping.
7. 16 January 1997: Andrew Yao (Princeton). New-Element Identification and the Pigeonhole Principle.
8. 23 January 1997: Nina Amenta (Xerox PARC). The Shape of a Set of Points .
9. 30 January 1997: Mohan Paturi (UC San Diego). Exponential Lower Bounds for Depth 3 Boolean Circuits.
10. 6 February 1997: Amin Shokrollahi (ICSI). Practical Loss-Resilient Codes.
11. 13 February 1997: Jon Kleinberg (Cornell University and IBM Almaden). Nearest-Neighbor Search in High Dimensions.
12. 6 March 1997: Monika Henzinger (DEC). Fully Dynamic Minimum Spanning Trees in O(n^{1/3} log n) time per operation.
13. 18 March 1997: Sanjeev Arora (Princeton). Nearly Linear Time Approximation Schemes for Euclidean TSP and Other Geometric Problems.
14. 10 April 1997: Edith Cohen (UC-Berkeley and AT&T Labs). Approximating Matrix Multiplication for Pattern Recognition Tasks.
15. 1 May 1997: Susanne Albers (Max Planck Institut fur Informatik, Saarbrucken). Better Bounds for Online Scheduling.
16. 8 May 1997: Martin Dietzfelbinger (Universitaet Dortmund). The Linear-Array Problem in Communication Complexity Resolved .
17. 15 May 1997: Yoram Singer (AT&T Labs). Using and Combining the Advice of Experts that Specialize.
18. 17 July 1997: Moni Naor (Weizmann Institute). Number-Theoretic Constructions of Efficient Pseudo-random Functions and Other Cryptographic Primitives.

## 9 October 1996

Gates Building 498. 4:15 p.m.

## Roy Armoni (Hebrew University)

### Discrepancy Sets and Pseudorandom Generators for Combinatorial Rectangles

A common subproblem of DNF approximate counting and derandomizing RL is the discrepancy problem for combinatorial rectangles. We explicitly construct a $poly(n)$-size sample space that approximates the volume of any combinatorial rectangle in $[n]^n$ to within $o(1)$ error (improving on the constructions of Even et al. from 1992. The construction extends the techniques of Linial, Luby, Saks and Zuckerman from 1995 for the analogous hitting set problem, most notably via discrepancy preserving reductions.

Joint work with Michael Saks, Avi Wigderson, and Shiyu Zhou.

Email: aroy@cs.huji.ac.il
WWW: http://www.cs.huji.ac.il/~aroy
Related paper: To appear in FOCS '96

## 17 October 1996

Gates Building 498. 4:15 p.m.

## Bruce Donald (Cornell)

### Algorithmic Foundations for Planar Force-Field Devices: Geometric Algorithms for Massively-Parallel Distributed Manipulation

In _distributed manipulation_, a distributed system of programmable actuators manipulates its environment to perform a task. Examples of manipulation tasks include moving furniture, sorting or feeding parts, and mechanical assembly. Developing a science base for distributed manipulation foregrounds deep issues in computer science and robotics, including the parallelization of physical algorithms'' and the information structure of distribution. A wealth of geometric and algorithmic problems arise in the control and programming of manipulation systems with many independent actuators. This talk will present new experimental results as well as new theoretical results (since my last talk st Stanford). I'll focus on our progress in developing combinatorially precise geometric planning algorithms.

The manufacturing problems of parts-feeding and -orientation motivate our work on massively-parallel distributed manipulation. I'll discuss our algorithms for sensorless manipulation using SIMD arrays of microfabricated actuators. We are testing these strategies on the M-CHIP (Manipulation Chip), an array of programmable micro-motion pixels fabricated at Cornell. I'll show a prototype M-CHIP containing over 15,000 silicon "cilia" (actuators) in one square inch. This may be a kind of record for parallelism and density. We have developed and analyzed efficient SIMD manipulation strategies and planning algorithms for parts-sorting and -orienting using a computational-geometric analysis of programmable vector fields. I will also discuss new experimental results using an organic ciliary array developed at Stanford, and new results on macroscopic parts feeders.

We believe such systems could result in smart'' (i.e., programmable) manipulation surfaces, tiled with microactuators. I'll discuss the challenges in programming such systems, and in designing efficient control and planning algorithms. New upper and lower bounds are discussed, as well as connections to computational dynamical systems theory. I'll show a video of our prototype systems.

## 24 October 1996

Gates Building 498. 4:15 p.m.

## Monika Henzinger (Digital SRC and Cornell)

### New Bounds for Computing Vertex Connectivity

Consider a directed or undirected graph with weights at the nodes. The vertex connectivity \kappa of the graph is the weight of the minimum-weight set of nodes whose removal disconnects the graph.

We modify the Hao & Orlin approach for computing edge connectivity to give the fastest known deterministic and randomized algorithms for computing vertex connectivity.

Our randomized algorithm improves the running time over existing algorithms

• by a factor of n*\kappa in the weighted case, and
• by a factor of at least min (\kappa, sqrt(n)) in the unweighted case,
where n is the number of nodes in the graph.

To be precise, the randomized algorithm computes \kappa with probability 1/2 in time

• O(nm) in unweighted graphs, and in time
• O(nm \log (n^2/m)) in weighted graphs,
where m is the number of edges in the graph.

This is joint work with Satish Rao and Hal Gabow and will appear in FOCS 96.

Email: monika@pa.dec.com
WWW: http://www.cs.cornell.edu/Info/People/mhr/home.html
Related paper: http://www.cs.cornell.edu/Info/People/mhr/papers/focs96.ps

## 31 October 1996

Gates Building 498. 4:15 p.m.

## Danny Raz (ICSI)

### Approximating Total Flow Time on Parallel Machines

We consider the problem of optimizing the total flow time of a stream of jobs that arrive over time in a multiprocessor setting. This problem is $NP$-hard even when we allow preemption, and have only two machines. Although the total (or average) flow time is widely accepted as a good measurement of the overall quality of service, no approximation algorithms were known for this problem.

We prove that when preemption is allowed, {\em Shortest Remaining Processing Time} (SRPT) is an $O(\log ({\rm min}\{\frac{n}{m}, P\}))$ approximation algorithm for the total flow time, where $n$ is the number of jobs, $m$ is the number of machines, and $P$ is the ratio between the maximum and the minimum processing time of a job. We also provide an $\Omega(\log ({\rm max}\{\frac{n}{m},P\}))$ lower bound on the competitive ratio of any randomized on-line algorithm, therefore showing that SRPT is an optimal on-line algorithm.

The solution of the preemptive version of the problem is used as a basis to obtain a scheduler in the non-preemptive case. Here we give an $O(\sqrt {\frac{n}{m}}\log {\frac{n}{m}})$ approximation algorithm and an $\Omega(n^{\frac{1}{3}-\epsilon})$ lower bound on the approximability of this problem (assuming $P \not= NP$).

This is a joint work with Stefano Leonardi.

## 21 November 1996

Gates Building 498. 4:15 p.m.

## Moses Charikar (Stanford)

### Dynamic Servers: A Model for Kinematic Structures and Video-on-Demand

We present a generalization of the $k$-server problem, called {\em dynamic servers}, where the number of servers is not fixed a priori; rather, the algorithm is free to increase and decrease the number of servers at will, but it is required to pay a rental cost for each active server at designated times. This new problem is a simultaneous abstraction for problems arising in a variety of applications, particularly the information delivery problem for video-on-demand, and the problem of dynamic maintenance of kinematic structures for applications in molecular biology, simulation of hyperredundant robots, collision detection, and computer animation. The problem also appears to be of theoretical significance as a natural new paradigm in the realm of online algorithms. We give approximation algorithms for the offline problem, and initiate the study of the online version of this problem. Our results are based on a geometric reformulation of the dynamic servers problem that leads to interesting connections with Steiner trees and geometric partitioning problems, and our results may be of independent interest in that context.

This is joint work with Dan Halperin and Rajeev Motwani.

## 9 January 1997

Gates Building 498. 4:15 p.m.

## Marshall Bern (Xerox PARC)

### Meshing, Morphing, and Mapping

I will discuss two discuss two problems in computational geometry. The first is decomposing a simple polygonal region into "round" pieces. This problem arises in domain decomposition methods for solving differential equations, but I suspect that it has many other uses, for example, in computing electrical resistance or bounding the cover time of random walks on polygonal regions. The second problem is homeomorphically mapping one simple polygonal region to another. This problem arises in cartography and in animation. Both projects are ongoing, and I'm hoping to attract collaborators and follow-on work.

## 16 January 1997

Gates Building 463. 4:15 p.m.

## Andrew Yao (Princeton)

### New-Element Identification and the Pigeonhole Principle

Let A[1:n] be an array of entries from [m]={1,2,...,m} where n < m. One would like to identify an element in [m] that does not appear in the array A by adaptively asking queries A[j]=?, with the restriction that each location j is examined at most once. Clearly, there are algorithms that solve this problem using O(n) bits of memory. In this talk, we prove that any such algorithm must use at least O(n^1/2) bits of memory, and that this is nearly optimal. This result is then used to show that, for a large class of resolution proofs, the minimum size of any proof for the Pigeonhole Principle PHP(m,n) (when m pigeons are assigned to n holes, some hole must have have more than one pigeons) must be greater than c^{n^1/2}. This is joint work with A. Razborov and A. Wigderson.

## 23 January 1997

Gates Building 463. 4:15 p.m.

## Nina Amenta (Xerox PARC)

### The Shape of a Set of Points

Reconstruction of the boundary of an object from an unordered set of points is a fundamental problem in both computer vision and computer graphics, so it would be nice to have a mathematical definition of the shape'' of a set of points which reflects the idea that the points lie on a $d-1$-dimensional surface. We propose a simple definition for points in the plane, called the {\em crust}.

We can prove that if a smooth curve is sampled densely enough, then the crust of the samples approximates the curve, with no extraneous features. The minimum required sampling density varies along the curve according to the {\em Local Feature Size} (which is also simply defined), so that areas of less detail can be sampled less densely.

The crust is easy to compute using Voronoi diagrams in $O(n \lg n)$ time (even in practice). We will demonstrate the computation of crusts on arbitrary point sets.

Email: amenta@parc.xerox.com
WWW: http://www.geom.umn.edu/~nina
Related paper: (TBA)

## 30 January 1997

Gates Building 498. 4:15 p.m.

## Mohan Paturi (UC San Diego)

### Exponential Lower Bounds for Depth 3 Boolean Circuits

We consider the class $\Sigma^{k}_3$ of unbounded fan-in depth-3 boolean circuits, for which the bottom fan-in is limited by $k$ and the top gate is an OR. It is known that the smallest such circuit computing the parity function has $\Omega(2^{\varepsilon n/k})$ gates (for $k=O(n^{1/2})$ for some $\varepsilon >0$, and this was the best lower bound known for explicit (P-time computable) functions. In this talk, for $k=2$, we exhibit a function in uniform $NC^1$ that requires $2^{n - o(n)}$ size depth 3 circuits. The main tool is a theorem that shows that any $\Sigma^{2}_3$ circuit on $n$ variables that accepts $A$ inputs and has size $s$ must be constant on a projection (subset defined by equations of the form $x_i=0$,$x_i=1$, $x_i=x_j$ or $x_i=\bar{x}_j$) of dimension at least $\frac{\log(A/s)}{\log n}$.

This is joint work with Mike Saks and Francis Zane.

## 6 February 1997

Gates Building 463. 4:15 p.m.

## Amin Shokrollahi (ICSI)

### Practical Loss-Resilient Codes

We present a randomized construction of linear-time encodable and decodable codes that can transmit over lossy channels at rates extremely close to capacity. The encoding and decoding algorithms for these codes have fast and simple software implementations. Partial implementations of our algorithms are faster by orders of magnitude than the best software implementations of previous algorithms. We expect these codes will be extremely useful for applications such as real-time audio and video transmission over the Internet, where lossy channels are common and fast decoding is a requirement.

Despite the simplicity of the algorithms, their design and analysis are very intricate and elegant mathematically. The design requires the careful choice of a random irregular bipartite graph, where the structure of the irregular graph is extremely important. We model the progress of the decoding algorithm by a set of differential equations. The solution to these equations can then be expressed as polynomials in one variable with coefficients determined by the graph structure. Based on these polynomials, we design a graph structure that guarantees successful decoding with high probability. Joint work with Mike Luby, Mike Mitzenmacher, Dan Spielman and Volker Stemann.

Email: amin@icsi.berkeley.edu
WWW: http://www.icsi.berkeley.edu/~amin
Related paper: http://theory.stanford.edu/~aflb/shokrollahi97.ps

## 13 February 1997

Gates Building 498. 4:15 p.m.

## Jon Kleinberg (Cornell University and IBM Almaden)

### Nearest-Neighbor Search in High Dimensions

Representing data as points in a high-dimensional space, so as to use geometric methods for indexing, is an algorithmic technique with a wide array of uses. It is central to a number of areas such as information retrieval, pattern recognition, and statistical data analysis; many of the problems arising in these applications can involve several hundred or several thousand dimensions.

We consider the nearest-neighbor problem for $d$-dimensional Euclidean space: we wish to pre-process a database of $n$ points so that given a query point, one can efficiently determine its nearest neighbors in the database. There is a large literature on algorithms for this problem, in both the exact and approximate cases. The more sophisticated algorithms typically achieve a query time that is logarithmic in $n$ at the expense of an exponential dependence on the dimension $d$; indeed, even the average-case analysis of heuristics such as k-d trees reveals an exponential dependence on $d$ in the query time.

In this work, we develop a new approach to the nearest-neighbor problem, based on a method for combining randomly chosen one-dimensional projections of the underlying point set. We use this method to obtain, among other results, an approximate nearest-neighbor algorithm with a query time that is logarithmic in $n$ and polynomial in the dimension $d$.

WWW: http://www.cs.cornell.edu/home/kleinber/kleinber.html
Related paper: (TBA)

## 27 February 1997

Gates Building 498. 4:15 p.m.

## Michael Mitzenmacher (Digital SRC)

### Average-Case Analyses of First Fit and Random Fit Bin Packing

We prove that the First Fit bin packing algorithm is stable under the input distribution U{k-2,k} ( that is, item sizes are uniformly chosen from {1/k,2/k,...,(k-2)/k} ) for all k greater than or equal to 3, settling an open question from the recent bin-packing survey by Coffman, Garey, and Johnson.

Our proof generalizes the multi-dimensional Markov chain analysis used by Kenyon, Rabani, and Sinclair to prove that Best Fit is also stable under these distributions.

Our proof is motivated by a similar analysis of Random Fit, a new simple packing algorithm related to First Fit, that is interesting in its own right.

This is joint work with Susanne Albers.

Email: michaelm@pa.dec.com
WWW: http://www.research.digital.com/SRC/personal/Michael_Mitzenmacher/home.html
Related paper: (TBA)

## 6 March 1997

Gates Building 498. 4:15 p.m.

## Monika Henzinger (DEC)

### Fully Dynamic Minimum Spanning Trees in O(n^{1/3} log n) time per operation

The fully dynamic minimum spanning tree problem is defined as follows: Maintain the minimum spanning forest of a weighted graph G under an arbitrary sequence of edge insertions and deletions.

It requires building a data structure that after an edge insertion or deletion

1. can quickly determine if and how the minimum spanning forest is affected by the change, and
2. can quickly be updated.
The best previous data structure (Eppstein, Galil, Italiano, and Nissenzweig 1992) requires time O(n^{1/2}) per edge insertion or deletion. We present a new data structure that takes only time O(n^{1/3} log n).

We achieve this result in two steps: First we give a deletions-only data structure that takes time O(n^{1/3} log n) per operation. Second we present a general technique to combine a deletions-only minimum spanning tree data structure with an insertions-only minimum spanning tree data structure to create a fully dynamic data structure. We expect that this technique will also be useful for other dynamic graph problems.

This is joint work with Valerie King.

Email: monika@pa.dec.com
WWW: http://www.research.digital.com/SRC/personal/Monika_Henzinger/home.html
Related paper: (TBA)

## 18 March 1997

Gates Building 498. 4:15 p.m.

## Sanjeev Arora (Princeton)

### Nearly Linear Time Approximation Schemes for Euclidean TSP and Other Geometric Problems

The Euclidean Traveling Salesman Problem is NP-hard. In 1976, Christofides discovered a polynomial-time approximation algorithm that finds a salesman tour of cost at most 1.5 times the optimal.

Last year we presented a new algorithm that is an approximation scheme: for any fixed c>0, it computes a tour of cost at most 1+c times the optimal. The scheme's running time was n^{O(1/c)}.

This talk will describe an approximation scheme that is simpler and much more efficient. For any fixed c>0, it finds a (1+c)-approximate tour in n*polylog(n) time, which is nearly linear in n. The scheme also works in nearly linear time on instances in d dimensions (for fixed d).

Our algorithm generalizes to many other geometric problems, including the famous Minimum Steiner Tree problem, and the Euclidean Matching problem.

## 10 April 1997

Gates Building 498. 4:15 p.m.

## Edith Cohen (UC Berkeley and AT&T Labs)

### Approximating Matrix Multiplication for Pattern Recognition Tasks

Many pattern recognition tasks, including estimation, classification, and the finding of similar objects, make use of linear models. The fundamental operation in such tasks is the computation of the dot product between a query vector and a large database of instance vectors. Often we are interested primarily in those instance vectors which have high dot products with the query. We present a random sampling based algorithm that enables us to identify, for any given query vector, those instance vectors which have large dot products, while avoiding explicit computation of all dot products. We provide experimental results that demonstrate considerable speedups for text retrieval tasks. Our approximate matrix multiplication algorithm is applicable to products of $k\geq 2$ matrices and is of independent interest. Our theoretical and experimental analysis demonstrates that in many scenarios, our method dominates standard matrix multiplication.

Joint work with David D. Lewis.

Email: edith@CS.Berkeley.EDU
WWW: http://www.research.att.com/~edith
Related paper: http://www.research.att.com/~edith/Papers/posmat.ps.Z

## 1 May 1997

Gates Building 498. 4:15 p.m.

## Susanne Albers (Max Planck Institut fur Informatik, Saarbrucken)

### Better Bounds for Online Scheduling

We study a classical problem in online scheduling. A sequence of jobs must be scheduled on $m$ identical parallel machines. As each job arrives, its processing time is known. The goal is to minimize the makespan, i.e, the completion time of the last job to finish.

The problem was introduced by Graham in 1966. Graham developed the well-known List scheduling algorithm that schedules every incoming job on the least loaded machine. The List algorithm is $(2- {1\over m})$-competitive. In recent years, research has focused on finding algorithms with a competitive ratio bounded away from 2.

Bartal, Fiat, Karloff and Vohra (1992) gave a deterministic online algorithm that is 1.986-competitive. Karger, Phillips and Torng (1994) generalized the algorithm and proved an upper bound of 1.945. The best lower bound currently known on the competitive ratio that can be achieved by deterministic online algorithms it equal to 1.837.

We present an improved deterministic online scheduling algorithm that is 1.923-competitive, for all $m\geq 2$. The algorithm is based on a new scheduling strategy, i.e., it is not a generalization of the approach by Bartal et al. Also, the algorithm has a simple structure. Furthermore, we develop a better lower bound. We prove that, for general $m$, no deterministic online scheduling algorithm can be better than 1.852-competitive. The contribution of our work is not primarily the improvement in the competitive ratios but the new scheduling algorithm underlying our solution.

## 8 May 1997

Gates Building 498. 4:15 p.m.

## Martin Dietzfelbinger (Universitaet Dortmund)

### The Linear-Array Problem in Communication Complexity Resolved

Tiwari (1987) considered the following scenario: k+1 processors P_0,...,P_k, connected by k links to form a linear array, compute a function f(x,y), for inputs (x,y) from a finite domain X x Y, where x is only known to P_0 and y is only known to P_k; the intermediate processors P_1,...,P_{k-1} initially do not have any information. The processors compute f(x,y) by exchanging binary messages across the links according to some protocol. Let D_k(f) denote the minimal complexity of such a protocol, i.e., the total number of bits sent across all links for the worst case input, and let D(f) = D_1(f) denote the (standard) two-party communication complexity of f.

Tiwari proved that D_k(f) >= k*(D(f)-O(1)) for almost all functions, and conjectured this to be true for all f. His conjecture was falsified by Kushilevitz, Linial, and Ostrovsky (1996): they exhibited a function f for which D_k(f) is essentially bounded above by 0.75*D(f). The best known general lower bound known is D_k(f) >= k*(D(f)^(1/2)-log k-3).

We prove: D_k(f) >= 0.146 * k * D(f), for all functions f and all k >= 2. Applying the general framework provided by Tiwari, one may derive lower bounds for the communication complexity of a function in general asynchronous networks in terms of its two-party communication complexity. Moreover, resolving a longstanding open question, the result shows that the communication complexity of a function directly implies a lower bound for the running time of one-tape Turing machines computing this function.

The nondeterministic and the randomized case are also studied, leading to similar results.

Email: dietzf@noether.informatik.uni-dortmund.de
Related paper: To appear in STOC '97

## 15 May 1997

Gates Building 498. 4:15 p.m.

## Yoram Singer (AT&T Labs)

### Using and Combining the Advice of Experts that Specialize

We study online learning algorithms that predict by combining the predictions of several subordinate prediction algorithms, sometimes called experts.'' These simple algorithms belong to the multiplicative weights family of algorithms. The performance of these algorithms degrades only logarithmically with the number of experts, making them particularly useful in applications where the number of experts is very large. However, in applications such as text categorization, it is often natural for some of the experts to abstain from making predictions on some of the instances. We show how to transform algorithms that assume that all experts are always awake to algorithms that do not require this assumption. We also show how to derive corresponding loss bounds. Our method is very general, and can be applied to a large family of online learning algorithms. Finally, we give applications to various prediction problems such as decision graphs, text categorization, and portfolio selection.

Based on the following papers:

• Freund, Schapire, Singer, Warmuth, (STOC97). Using and Combining Predictors that Specialize
• Cohen, Singer, (SIGIR96). Context-Sensitive Learning Methods for Text Categorization
• Helmbold, Schapire, Singer, Warmuth, (submitted to Mathematical Finance). On-line Portfolio Selection Using Multiplicative Updates
Email:
singer@research.att.com
WWW: http://www.research.att.com/~singer
Related paper: http://www.research.att.com/~singer/papers/sleeping.ps.gz

## 17 July 1997

Gates Building 498. 3:00 p.m.

## Moni Naor (Weizmann Institute)

### Number-Theoretic Constructions of Efficient Pseudo-random Functions and Other Cryptographic Primitives

We describe efficient constructions for various cryptographic primitives (both in private-key and in public-key cryptography), based on the decisional version of the Diffie-Hellman assumption and on the hardness of factoring. Our major result is a new and efficient construction of pseudo-random functions. Computing the value of the function at any given point involves two multiple products (which is comparable with two exponentiations). In particular, the functions are shallow, they can be computed in TC^0 (the class of functions computable by constant depth circuits consisting of a polynomial number of threshold gates). Using the simple algebraic structure of the pseudo-random function, f_s, we show a zero-knowledge proof for statements of the form y=f_s(x)'' and additional features of the functions.

Joint work with Omer Reingold.