Stanford Theory Seminar (formerly known as the Stanford Algorithms Seminar/AFLB) is usually held in Gates 498 or the Theory Lounge (Gates 463A).

Faculty Contact: Ryan Williams

Student Contact: Kevin Lewi


Abstract: A seller has m indivisible goods and faces multiple buyers who have combinatorial valuations. The seller must choose between auctioning all the goods in m independent first-price auctions and selling them sequentially in first-price auctions (in the latter case she can also choose the order). What is her optimal strategy? We assume a full-information model, and analyze the revenue and welfare of the two auction formats in different settings.

Based on joint works with Haim Kaplan, Yishay Mansour and Noam Nisan.
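As a toy illustration of the simultaneous format (not the paper's analysis), with additive valuations each independent full-information first-price auction has an equilibrium in which the highest-value bidder wins at essentially the runner-up's value. The valuations below are made up:

```python
# Toy sketch: revenue of m simultaneous first-price auctions under full
# information and ADDITIVE valuations (the combinatorial/sequential case
# analyzed in the talk is more subtle).
def simultaneous_revenue(values):
    """values[b][j] = buyer b's value for item j (additive valuations)."""
    m = len(values[0])
    revenue = 0
    for j in range(m):
        bids = sorted((v[j] for v in values), reverse=True)
        revenue += bids[1]          # winner pays (roughly) the runner-up value
    return revenue

vals = [[10, 3], [6, 7], [4, 5]]    # 3 buyers, 2 items (made-up numbers)
print(simultaneous_revenue(vals))   # 6 + 5 = 11
```

Comparing this to the sequential format, where earlier sales change later competition, is exactly the kind of question the talk addresses.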

We study the subgraph counting problem in data streams. We provide the first non-trivial estimator for approximately counting the number of occurrences of an arbitrary subgraph H of constant size in a (large) graph G. Our estimator works in the turnstile model, i.e., it can handle both edge-insertions and edge-deletions, and is applicable in a distributed setting. Prior to this work, estimators were known only for a few non-regular graphs in the case of edge-insertions, leaving the problem of counting general subgraphs in the turnstile model wide open. We further demonstrate the applicability of our estimator by analyzing its concentration for several graphs H and for the case where G is a power-law graph.
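As a point of reference (an exact baseline, not the streaming estimator from the talk), the turnstile interface can be sketched as follows for H a triangle:

```python
from collections import Counter
from itertools import combinations

class TurnstileGraph:
    """Exact baseline, not the space-efficient estimator from the talk:
    maintains the edge set under insertions/deletions, counts triangles."""
    def __init__(self):
        self.edges = Counter()

    def update(self, u, v, delta):          # delta = +1 insert, -1 delete
        e = (min(u, v), max(u, v))
        self.edges[e] += delta
        if self.edges[e] == 0:
            del self.edges[e]

    def triangles(self):
        adj = {}
        for u, v in self.edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        return sum(1 for a, b, c in combinations(sorted(adj), 3)
                   if b in adj[a] and c in adj[a] and c in adj[b])

g = TurnstileGraph()
for u, v in [(1, 2), (2, 3), (1, 3), (3, 4)]:
    g.update(u, v, +1)
g.update(3, 4, -1)                          # an edge-deletion
print(g.triangles())                        # 1
```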

The study of combinatorial problems with a submodular objective has attracted much attention in recent years, partly motivated by the importance of such problems to economics, algorithmic game theory and combinatorial optimization. Submodular functions also play a major role in combinatorics and graph theory. A partial list of well-known problems captured by submodular maximization includes: Max-Cut, Max-k-Cover, Generalized-Assignment, Max-Bisection, Max Facility-Location, several variants of Max-SAT and some welfare problems.

We present improved approximation guarantees in both the unconstrained and the constrained settings. For the Unconstrained Submodular Maximization problem, we present a simple linear-time algorithm that achieves an information-theoretically tight approximation of 1/2. In the constrained setting, it is known that the main bottleneck is the computation of a fractional solution to the multilinear relaxation of the problem at hand. We present a unified algorithm that computes fractional solutions for this relaxation and works for both monotone and non-monotone objectives. Some notable immediate implications are information-theoretically tight approximations for Submodular Max-SAT and Submodular-Welfare with k players, for any number of players k, and an improved (1/e)-approximation for maximizing a non-monotone submodular function subject to a matroid or O(1) knapsack constraints.
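The linear-time 1/2-approximation mentioned above is a "double greedy" algorithm. A minimal sketch of its deterministic variant (which only guarantees 1/3; the randomized variant achieves 1/2 in expectation), run here on a cut function:

```python
def double_greedy(f, ground_set):
    """Deterministic 'double greedy' sketch (1/3-approximation; the talk's
    randomized variant achieves 1/2 in expectation). f maps a frozenset to
    its submodular value."""
    X, Y = set(), set(ground_set)
    for u in ground_set:
        a = f(frozenset(X | {u})) - f(frozenset(X))   # gain of adding u to X
        b = f(frozenset(Y - {u})) - f(frozenset(Y))   # gain of dropping u from Y
        if a >= b:
            X.add(u)
        else:
            Y.discard(u)
    return X                                          # X == Y at the end

# Cut functions are non-monotone submodular; a 4-cycle plus the chord (0,2).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
cut = lambda S: sum(1 for u, v in edges if (u in S) != (v in S))
print(sorted(double_greedy(cut, range(4))))           # [0, 2], a cut of size 4
```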

We develop a framework for proving lower bounds on computational problems over distributions, including optimization and unsupervised learning. Our framework is based on defining a restricted class of algorithms, called statistical algorithms, that instead of accessing samples from the input distribution can only obtain an estimate of the expectation of any given function on a sample drawn randomly from the input distribution. Our definition captures many natural algorithms used in theory and practice, e.g. moments-based methods, local search, MCMC and simulated annealing. Our techniques are inspired by (and generalize) the statistical query model in learning theory, which captures the complexity of PAC learning using essentially all known learning methods (Kearns, JACM 1998).

For specific well-known problems over distributions, we give lower bounds on the complexity of any statistical algorithm. These include an exponential lower bound for moment maximization in R^n, and a nearly optimal lower bound for finding a planted bipartite clique when the clique has size $O(n^{1/2-\delta})$ for any constant $\delta > 0$. Variants of the latter problem have been assumed hard in order to prove hardness for other problems and for cryptographic applications. Our lower bounds provide the first concrete evidence supporting these assumptions. Joint work with Elena Grigorescu, Lev Reyzin, Santosh Vempala and Ying Xiao.
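The restricted access model of a statistical algorithm can be sketched as a simulated oracle (names and parameters here are illustrative, not from the paper):

```python
import random

def stat_oracle(dist_sampler, f, tau, n_samples=10000, seed=0):
    """Sketch of a STAT(tau) oracle: the algorithm supplies a function f and
    receives an estimate of E[f(x)] accurate to within tolerance tau.  Here
    we simulate the oracle by averaging over samples; a real oracle may
    return any value within tau of the true expectation."""
    rng = random.Random(seed)
    est = sum(f(dist_sampler(rng)) for _ in range(n_samples)) / n_samples
    return est

# Example query: the first moment of a biased coin (bias 0.7).
coin = lambda rng: 1 if rng.random() < 0.7 else 0
print(round(stat_oracle(coin, lambda x: x, tau=0.05), 1))  # ~0.7
```

The point of the model is that lower bounds proved against such oracles apply to any algorithm that only interacts with the distribution this way, such as moment-based methods or local search.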

Abstract: There has been much progress recently on improved approximations for problems involving submodular objective functions, and many interesting techniques have been developed. However, the resulting algorithms are often slow and impractical. In this work we develop general techniques for obtaining very fast approximation algorithms for maximizing submodular functions subject to various constraints. These include speedups of greedy and continuous-greedy-based algorithms and a new potential-function-based local search algorithm that handles multiple constraints.

(Based on joint work with Jan Vondrak)
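One standard way to speed up greedy is lazy evaluation of marginal gains (a classical Minoux-style sketch in the spirit of the talk, not necessarily the authors' method), shown here for Max-k-Cover:

```python
import heapq

def lazy_greedy(f, ground_set, k):
    """Lazy greedy sketch for monotone submodular maximization under a
    cardinality constraint: keep (possibly stale) marginal gains in a
    max-heap and re-evaluate only the current top, exploiting the fact
    that submodular gains can only shrink."""
    S, val = set(), f(frozenset())
    heap = [(-(f(frozenset({u})) - val), u) for u in ground_set]
    heapq.heapify(heap)
    while len(S) < k and heap:
        neg_gain, u = heapq.heappop(heap)
        fresh = f(frozenset(S | {u})) - val       # re-evaluate the stale gain
        if not heap or fresh >= -heap[0][0]:      # still the best: take it
            S.add(u)
            val += fresh
        else:
            heapq.heappush(heap, (-fresh, u))     # reinsert, try the next one
    return S

# Max-k-Cover instance: choose k=2 sets covering the most elements.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}}
cover = lambda S: len(set().union(*(sets[i] for i in S)) if S else set())
print(sorted(lazy_greedy(cover, sets, 2)))        # [0, 2]
```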

Abstract: We consider a fundamental problem in unsupervised learning: We are given a collection of m points in n dimensions such that all points, except for some fraction of outliers, are contained in a lower-dimensional subspace. The goal is to recover the subspace. This problem has received considerable attention in computer science and in statistics. Yet efficient algorithms from computer science are not robust to adversarial outliers, and the estimators from robust statistics are hard to compute in high dimensions. This is a serious and persistent issue not just in this application, but for many other problems in unsupervised learning. Are there algorithms for this problem that are both robust to outliers and efficient? We narrow down the answer to this question by giving new algorithms and hardness results. Our results are based on new connections to other areas including small set expansion, matroid theory and functional analysis.

Joint work with Ankur Moitra.

Abstract: Polynomial Identity Testing (PIT) is the problem of identifying whether a given algebraic circuit computes the identically zero polynomial. It is well-known that this problem can be solved with small error probability by testing whether the circuit evaluates to zero on random evaluation points. Recently, there has been much interest in solving this problem deterministically, because it has close connections with circuit lower bounds, and this problem is now one of the frontiers of the area of pseudorandomness.
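The randomized test mentioned above can be sketched via the Schwartz-Zippel lemma (a generic illustration, with the circuit given as a black-box function):

```python
import random

def randomized_pit(circuit, n_vars, degree_bound, trials=20, seed=0):
    """Classic randomized PIT via the Schwartz-Zippel lemma: a nonzero
    polynomial of degree d evaluates to zero at a uniformly random point of
    S^n with probability at most d/|S|."""
    rng = random.Random(seed)
    S = range(10 * degree_bound + 1)          # |S| >> degree => small error
    for _ in range(trials):
        point = [rng.choice(S) for _ in range(n_vars)]
        if circuit(point) != 0:
            return False                      # witness: not identically zero
    return True                               # probably identically zero

# (x+y)^2 - (x^2 + 2xy + y^2) is identically zero; (x+y)^2 - x^2 is not.
zero = lambda p: (p[0] + p[1])**2 - (p[0]**2 + 2*p[0]*p[1] + p[1]**2)
nonzero = lambda p: (p[0] + p[1])**2 - p[0]**2
print(randomized_pit(zero, 2, 2), randomized_pit(nonzero, 2, 2))
```

Derandomizing exactly this kind of test, for restricted circuit classes, is the subject of the talk.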

The method of partial derivatives is a fairly successful technique for understanding algebraic computation. Nisan used this method to show exponential lower bounds for non-commutative branching programs, a natural model of computation at least as powerful as formulas. Later, Raz and Shpilka used the method to develop a polynomial-time "white box" PIT algorithm for non-commutative branching programs, where this algorithm can "look" at the structure of the branching program. In this work, we extend the partial derivative method to the "black box" PIT setting for read-once oblivious branching programs, by deriving a quasi-polynomial-time algorithm that solves PIT by only evaluating the branching program at chosen locations. As a corollary, we also derive similar results for non-commutative branching programs.

Joint work with Amir Shpilka

Abstract: The complexity of evaluating a formula in predicate logic depends highly on the considered formula. Evaluating a formula from monadic second-order (MSO) logic can be NP-complete, while evaluating any first-order (FO) formula can be done in P. Another aspect that influences the complexity of evaluating formulas is the kind of input structures considered. Any MSO-formula can be evaluated by a linear-time algorithm on structures of bounded tree width, and the same bound holds for any FO-formula on structures whose tree width is bounded locally. While these upper bounds are optimal from the algorithmic point of view, in my talk I will outline an ongoing effort to find optimal upper bounds from the complexity-theoretic point of view. That is, to refine the polynomial-time bounds in terms of circuit- and space-efficient computations. Many problems that are studied in the realm of complexity classes like TC0, NC1, L, and LOGCFL are MSO-definable. Using their MSO-definitions not only unifies these problems, but also makes it easier to prove upper bounds via techniques for balancing tree decompositions and automata-based methods.

Abstract: We improve the running times of algorithms for least squares regression and low-rank approximation to account for the sparsity of the input matrix. Namely, if nnz(A) denotes the number of non-zero entries of an input matrix A:

- we show how to solve approximate least squares regression given an n x d matrix A in nnz(A) + poly(d) time
- we show how to find an approximate best rank-k approximation of an n x n matrix in nnz(A) + n*poly(k log n) time

All approximations are relative error. Previous algorithms based on fast Johnson-Lindenstrauss transforms took at least nd log d or nnz(A)*k time.

Joint work with Ken Clarkson.
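A sketch-and-solve illustration in the spirit of sparse embeddings (a CountSketch-style hashing of rows; the details here are illustrative, not the paper's exact construction):

```python
import numpy as np

def countsketch_solve(A, b, sketch_rows, seed=0):
    """Sketch-and-solve least squares: hash each row of A into one of
    sketch_rows buckets with a random sign, then solve the small sketched
    problem.  Building the sketch touches each nonzero row of A once."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    bucket = rng.integers(0, sketch_rows, size=n)
    sign = rng.choice([-1.0, 1.0], size=n)
    SA = np.zeros((sketch_rows, d))
    Sb = np.zeros(sketch_rows)
    for i in range(n):
        SA[bucket[i]] += sign[i] * A[i]
        Sb[bucket[i]] += sign[i] * b[i]
    x, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 5))
b = A @ np.array([1., 2., 3., 4., 5.]) + 0.01 * rng.standard_normal(2000)
x = countsketch_solve(A, b, sketch_rows=200)
print(np.round(x))                          # close to [1. 2. 3. 4. 5.]
```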

The problem of finding dense subgraphs in a given graph arises in a number of settings such as community mining, link spam detection and mining gene annotations. This talk will focus on the densest k-subgraph problem (i.e., find a size-k subgraph with the maximum number of edges). This problem has frustrated the development of good algorithms and yet resisted attempts to prove hardness-of-approximation results. In fact, the assumption that the problem is hard has found novel uses.

I will survey some of the results that are known for this densest subgraph problem. The algorithmic results (for worst case analysis) are inspired by studying an average case version of the problem on random inputs. This suggests a simple distribution on inputs which beats all current algorithmic techniques and seems to be a barrier for further progress. I will give evidence that the problem is difficult by showing that significantly better results for the problem cannot be obtained from powerful families of techniques based on mathematical programming relaxations.

Based on joint work with Aditya Bhaskara, Eden Chlamtac, Uriel Feige, Venkatesan Guruswami, Aravindan Vijayaraghavan, and Yuan Zhou.
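For intuition only, a common heuristic baseline for densest k-subgraph is greedy peeling; it carries no good worst-case guarantee, which is consistent with the talk's message about the problem's difficulty:

```python
def peel_to_k(adj, k):
    """Greedy peeling heuristic for densest k-subgraph: repeatedly delete a
    minimum-degree vertex until k vertices remain.  A natural baseline only;
    on worst-case inputs it can be far from optimal."""
    adj = {u: set(vs) for u, vs in adj.items()}   # work on a copy
    while len(adj) > k:
        u = min(adj, key=lambda v: len(adj[v]))
        for v in adj.pop(u):
            adj[v].discard(u)
    return set(adj)

# A 4-clique {0,1,2,3} with a pendant path 3-4-5 attached.
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3},
       3: {0, 1, 2, 4}, 4: {3, 5}, 5: {4}}
print(sorted(peel_to_k(adj, 4)))            # [0, 1, 2, 3]
```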

Abstract: We show optimal (up to a constant factor) NP-hardness for Max-k-CSP over any domain, whenever k is larger than the domain size. This follows from our main result concerning predicates over abelian groups. We show that a predicate is approximation resistant if it contains a subgroup that is balanced pairwise independent. This gives an unconditional analogue of the Austrin-Mossel hardness result, bypassing the Unique-Games Conjecture for predicates with an abelian subgroup structure.

Our main ingredient is a new soundness reduction technique inspired by XOR-lemmas. Using this technique, we also improve the NP-hardness of approximating Independent-Set on bounded-degree graphs, Almost-Coloring, Two-Prover-One-Round-Game, and various other problems.

The decomposition theorems are inspired by the breakthrough work of Chuzhoy on the maximum edge-disjoint paths problem in undirected graphs. The goal of the talk is to explain the background for the theorems and their applications to routing, to fixed-parameter tractability and to Erdos-Posa-type theorems. The latter applications allow one to bypass the well-known Grid-Minor theorem of Robertson and Seymour.

The decomposition theorems are from a joint paper with Julia Chuzhoy that is to appear in STOC 2013, but the talk is also based on several previous papers on the maximum disjoint paths problem.

Abstract: Fix F to be a finite field of prime order. An affine-invariant property is a property of functions on F^n that is closed under taking affine transformations of the domain. We prove that all affine-invariant properties having local characterizations are testable. In fact, we show a proximity-oblivious test for any such property P, meaning that there is a test that, given an input function f, makes a constant number of queries to f, always accepts if f satisfies P, and rejects with positive probability if the distance between f and P is nonzero. More generally, we show that any affine-invariant property that is closed under taking restrictions to subspaces and has bounded complexity is testable.
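A classic instance of a proximity-oblivious test (for linearity over F_2^n, a simpler setting than this work's, and not its construction) is the BLR test:

```python
import random

def blr_test(f, n, rng):
    """One round of the BLR linearity test over F_2^n: query f at x, y and
    x+y and check f(x)+f(y)=f(x+y).  Three queries; always accepts linear f;
    rejects functions far from linear with constant probability per round."""
    x = [rng.randrange(2) for _ in range(n)]
    y = [rng.randrange(2) for _ in range(n)]
    xy = [a ^ b for a, b in zip(x, y)]
    return f(x) ^ f(y) == f(xy)

rng = random.Random(0)
linear = lambda x: x[0] ^ x[2]                   # a linear function: passes always
far = lambda x: x[0] & x[1]                      # AND is far from linear
print(all(blr_test(linear, 3, rng) for _ in range(50)))   # True
print(all(blr_test(far, 2, rng) for _ in range(50)))      # False
```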

Our results depend on a new Gowers inverse theorem by Tao and Ziegler for low characteristic fields that decomposes any polynomial with large Gowers norm into a function of low-degree non-classical polynomials. We establish a new equidistribution result for high rank non-classical polynomials that drives the proofs of both the testability results and the local characterization of degree-structural properties.

Joint work with Eldar Fischer, Hamed Hatami, Pooya Hatami, and Shachar Lovett.

For $d$-dimensional point sets $A, B$ with $|A|=|B|=n$, I will present an algorithm to compute an $\eps$-approximate minimum-cost perfect matching of $A$ and $B$ in near-linear time; here the cost of an edge is the Euclidean distance between its endpoints. Our algorithm iteratively generates approximate minimum-cost augmenting paths, each in time proportional to its length. We show that the total length of the augmenting paths produced by our algorithm is $O((n/\eps) \log n)$, implying a near-linear running time.

Joint work with Pankaj Agarwal.
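For tiny inputs, an exact brute-force baseline against which such an approximation can be checked (illustrative only; exponential time, nothing like the talk's algorithm):

```python
from itertools import permutations
from math import dist, inf

def exact_min_cost_matching(A, B):
    """Brute-force O(n!) minimum-cost Euclidean perfect matching, usable
    only for tiny n, as a ground truth for approximation experiments."""
    best, best_pi = inf, None
    for pi in permutations(range(len(B))):
        cost = sum(dist(A[i], B[pi[i]]) for i in range(len(A)))
        if cost < best:
            best, best_pi = cost, pi
    return best, best_pi            # best_pi[i] = index in B matched to A[i]

A = [(0, 0), (1, 0), (0, 1)]
B = [(0, 1.1), (0.1, 0), (1, 0.1)]
cost, pi = exact_min_cost_matching(A, B)
print(pi)                           # (1, 2, 0): each point matched to its neighbor
```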

Abstract: In this talk we will discuss multiparty communication complexity in the message-passing model, where we have k machines, each holding a piece of data, and they want to jointly compute some function defined on the union of the k data sets via communication. The communication is point-to-point, and the goal is to minimize the total communication between the k sites. This model captures all point-to-point distributed computational models for Big Data in terms of communication complexity, including the BSP model and the MapReduce model. Problems we consider in this model include statistical and graph problems, numerical linear algebra, and database queries.

In this talk we will introduce three techniques developed in the last two years for proving communication lower bounds in the message-passing model, each via a concrete example. We believe that these techniques will have a wide applicability. At the end of the talk, we will discuss several future directions, in both the problem space and the technique space.

Geometric Complexity Theory (GCT) is an approach towards P versus NP and other lower bounds using algebraic geometry and representation theory. In this talk, we discuss how many known lower bounds can be viewed as following the part of the GCT approach outlined in [GCT I, II, SIAM J. Comput.]. This includes the best known bounds on permanent versus determinant, lower bounds on constant depth circuits, the partial derivatives technique, the connected components technique, the degree bound, matrix rigidity, and others. In particular, we expose the representation-theoretic content of these proofs, introducing GCT as a natural generalization of known methods. No knowledge of algebraic geometry or representation theory is assumed; rather, we motivate the use and definitions of representation theory via these previous complexity-theoretic proofs.

Abstract: We reduce non-deterministic time T > 2^n to a 3SAT instance phi of size |phi| = T polylog T such that there is an explicit circuit C that on input an index i of log |phi| bits outputs the i-th clause, and each output bit of C depends on O(1) input bits. The previous best result was C in NC^1. Even in the simpler setting of |phi| = poly(T) the previous best result was C in AC^0.

More generally, for any time T > n and parameter r we obtain log_2 |phi| = max(log T, n/r) + O(log n) + O(loglog T) and each output bit of C is a decision tree of depth O(log r).

As an application, we simplify the proof of Williams' ACC^0 lower bound, and tighten his connection between satisfiability algorithms and lower bounds.

Authors: Hamid Jahanjou, Eric Miles, Emanuele Viola