# Stanford Theory Seminar

Stanford Theory Seminar (formerly known as Stanford Algorithms Seminar/AFLB) is usually held in Gates 498 or Theory Lounge(Gates 463A)
Faculty Contact: Gregory Valiant
Student Contact: Weihao Kong
Click here for directions.

## September 24, 2015

Gates 463A, 4:15 PM

## Moses Charikar (Stanford)

### Euclidean k-means: an SDP relaxation and hardness of approximation

The Euclidean k-means problem is a classical problem that has been extensively studied in several communities. In this problem, we are given a set of n points in Euclidean space R^d , and the goal is to choose k center points in R^d so that the sum of squared distances of each point to its nearest center is minimized. The best approximation algorithms for this problem include a polynomial time constant factor approximation for general k and a (1+eps)-approximation which runs in time poly(n).exp(k/eps). At the other extreme, the only known computational complexity result for this problem is NP-hardness. The main difficulty in obtaining hardness results stems from the Euclidean nature of the problem, and the fact that any point in R^d can be a potential center. This gap in understanding left open the intriguing possibility that the problem might admit a PTAS for all k, d.

In this talk, I will present a new SDP relaxation for k-means. The study of integrality gaps for this relaxation led to the first hardness of approximation result for the Euclidean k-means problem. We show that there exists a constant delta > 0 such that it is NP-hard to approximate the k-means objective to within a factor of (1+delta). I will outline some of the ideas in the hardness of approximation proof, and discuss other research directions that came from the study of the k-means SDP.

Joint work with Pranjal Awasthi, Ravishankar Krishnaswamy and Ali Sinop.

## October 8, 2015

Gates 463A, 4:15 PM

## Arkadev Chattopadhyay (Tata Institute)

### Topology matters in communication

We study communication cost of computing functions when inputs are distributed among k processors, each of which is located at one vertex of a network/graph called a terminal. Every other node of the network also has a processor, with no input. The communication is point-to-point and the cost is the total number of bits exchanged by the protocol, in the worst case, on all edges.

Our results show the effect of topology of the network on the total communication cost. We prove tight bounds for simple functions like Element-Distinctness (ED), which depend on the 1-median of the graph. On the other hand, we show that for a large class of natural functions like Set-Disjointness the communication cost is essentially n times the cost of the optimal Steiner tree connecting the terminals. Further, we show for natural composed functions like EDXOR and XORED, the naive protocols suggested by their definition is optimal for general networks. Interestingly, the bounds for these functions depend on more involved topological parameters that are a combination of Steiner tree and 1-median costs.

To obtain our results, we use some tools like metric embeddings and linear programming whose use in the context of communication complexity is novel as far as we know.

(Based on joint works with Jaikumar Radhakrishnan and Atri Rudra)

## October 22, 2015

Gates 463A, 4:15 PM

## Persi Diaconis (Stanford)

### Using Algorithms to Prove Theorems

Sometimes, a clever algorithm gives a representation that allows theorems to be proved. I will illustrate with Stam's algorithm for generating random set partitions (so there are five set partitions of three things: 1/2/3, 12/3, 13/2, 23/1, 123) a variety of statistics have resisted analysis (particularly the number of crossings). I hope that this is the tip of an iceberg, but the audience will have to help. This is joint work with Daniel Kane, Bobbie Chern, and Rob Rhodes.

## October 29, 2015

Gates 463A, 4:15 PM

## Prasad Raghavendra (UC Berkeley)

### Lower bounds on the size of semidefinite programming relaxations

We introduce a method for proving lower bounds on the efficacy of semidefinite programming (SDP) relaxations for combinatorial problems. In particular, we show that the cut, TSP, and stable set polytopes on n-vertex graphs are not the linear image of the feasible region of any SDP (i.e., any spectrahedron) of dimension less than 2nc, for some constant c>0. This result yields the first super-polynomial lower bounds on the semidefinite extension complexity of any explicit family of polytopes. Our results follow from a general technique for proving lower bounds on the positive semidefinite rank of a matrix. To this end, we establish a close connection between arbitrary SDPs and those arising from the sum-of-squares SDP hierarchy. For approximating maximum constraint satisfaction problems, we prove that SDPs of polynomial-size are equivalent in power to those arising from degree-O(1) sum-of-squares relaxations. This result implies, for instance, that no family of polynomial-size SDP relaxations can achieve better than a 7/8-approximation for MAX-3-SAT. This is joint work with James Lee (U. Washington) & David Steurer (Cornell Univ).

## November 5, 2015

Gates 463A, 4:15 PM

## Itai Ashlagi (Stanford)

### Unbalanced Random Matching Markets

We study competition in matching markets with random heterogeneous preferences by considering an unequal number of agents on either side. First, we show that even the slightest imbalance yields an essentially unique stable matching. Second, we give a tight description of stable outcomes, showing that matching markets are extremely competitive. Each agent on the short side of the market is matched to one of his top preferences, and each agent on the long side is either unmatched or does almost no better than being matched to a random partner. Our results suggest that any matching market is likely to have a small core, explaining why small cores are empirically ubiquitous.

## November 12, 2015

Gates 463A, 4:15 PM

## Rahul Santhanam (Univeristy of Edinburgh)

### Correlation Lower Bounds and Satisfiability Algorithms for Threshold Circuits with Few Wires

We give the first average-case lower bounds for polynomial time Boolean functions against constant-depth threshold circuits with a super linear number of wires. We show that for each constant d, there is a constant \eps_d such that any depth-d threshold circuit with fewer than n^{1+\eps_d} wires has correlation 1/n^{\Omega(1)} with Parity, and correlation 1/2^{n^{\Omega{1}} with the Generalized Andreev function.

As a consequence of our techniques, we get satisfiability algorithms beating brute force search for threshold circuits with fewer than n^{1+\eps_d} wires. These are the first such satisfiability algorithms for depth d > 2. We also use our ideas to get lower bounds for Parity against super polynomial size AC^0 circuits with n^{o(1)} threshold gates.

Our techniques include random restrictions, anti-concentration and the structural theory of linear threshold functions, and read-k Chernoff bounds.

Joint work with Ruiwen Chen and Srikanth Srinivasan.

## Janurary 21, 2016

Gates 463A, 4:15 PM

## Alan Roytman (Tel Aviv)

### Zero-One Laws for Sliding Windows and Universal Sketches

Streaming algorithms serve an important role in analyzing vast amounts of data, especially when it is infeasible to store a large part of the input in memory. Given this limited amount of storage, a typical approach is to design a sophisticated algorithm that computes a specific statistic in small space. In this talk, we consider the following fascinating possibility: can we compute many statistics with a single algorithm? This question is even more challenging in the sliding window streaming model, where statistics are only computed over recent data.

Our main result is the design of a “universal” streaming algorithm in the sliding window model that can simultaneously approximate many functions (i.e., statistics) from a broad class of frequency-based monotonically increasing functions. That is, we construct asingle summary of the streaming data such that, for any function G taken from the class, the summary can be used to approximate the value \sum_{i=1}^n G(m_i) (here, m_i represents the frequency that element i from a universe of size n appears in the window).

In addition, we give a zero-one law for the class of monotonically increasing functions G that specifies when the summation \sum_{i=1}^n G(m_i) can be approximated in the sliding window model. That is, we define a certain set of properties such that, if a given function G satisfies these conditions, we provide an algorithm that approximates the sum using polylogarithmic memory. Otherwise, we give a super-polylogarithmic lower bound on the memory required for such an approximation.

Joint work with Vladimir Braverman and Rafail Ostrovsky.

## Janurary 29, 2016

Gates 463A, 3:00 PM

## Dana Moshkovitz (MIT)

### Amplification and Derandomization Without Slowdown

We show techniques for decreasing the error probability of randomized algorithms and for converting randomized algorithms to deterministic (non-uniform) algorithms. Unlike most existing techniques that involve repetition of the randomized algorithm and hence a slowdown, our techniques produce algorithms with a similar run-time to the original randomized algorithms.

The amplification technique is related to a certain stochastic multi-armed bandit problem. The derandomization technique -- which is the main contribution of this work -- points to an intriguing connection between derandomization and sketching/sparsification.

We demonstrate the techniques by showing applications to max-cut on dense graphs, approximate clique, constraint satisfaction problems on dense bipartite graphs, and list decoding to unique decoding for Reed-Muller code.

This is joint work with Ofer Grossman.

## February 18, 2016 and February 19, 2016

Room 200-002, 3:00--4:30 PM

## Laszlo Babai (University of Chicago)

### Graph Isomorphism in Quasipolynomial Time

The algorithm indicated in the title builds on Luks's classical framework and introduces new group theoretic and combinatorial tools.

The first talk will give an overview of the algorithm, outline the core group theoretic routine ("Local Certificates"), and sketch the aggregation of the local certificates.

The second talk will outline the combinatorial partitioning techniques ("Design Lemma" and "Split-or-Johnson") required for the group-theoretic recurrence.

Elements of undergraduate-level group theory such as facility with the concepts involved in the Jordan--Holder Theorem and basic concepts of permutation groups (such as orbits) will be assumed.

The paper is available at arXiv:1512.03547.

Helpful reading:

E.M. Luks : Isomorphism of graphs of bounded valence can be tested in polynomial time. J. Comp. Sys. Sci., 25:42--65, 1982.

## March 3, 2016

Gates 463A, 4:15 PM

## Amit Daniely (Google)

### Complexity theoretic limitations on learning

We introduce a new method to reduce random CSP problems to learning problems. Under a natural generalization of Feige's assumption about random K-SAT and K-XOR, this method substantially reduces the gaps between known upper and lower bounds. Concretely, it follows that:

1. Learning DNFs is hard.
2. Learning an intersection of super logarithmic number of halfspaces is hard.
3. Learning Halfspaces in the presence of a tiny amount of noise is hard.
4. Several additional learning problems are hard. In particular, virtually all problems that were previously shown intractable (under various, mostly cryptographic, assumptions).

Based on joint work with Nati Linial and Shai Shelev-Shwartz

## March 10, 2016

Gates 463A, 4:15 PM

## Merav Parter (MIT)

### MST in Log-Star Rounds of Congested Clique

We present a randomized algorithm that computes a Minimum Spanning Tree (MST) in O(log^* n) rounds, with high probability, in the Congested Clique model of distributed computing. In this model, the input is a graph on n nodes, initially each node knows only its incident edges, and per round each two nodes can exchange O(log n) bits.

Our key technical novelty is an O(log^* n) Graph Connectivity algorithm, the heart of which is a (recursive) forest growth method, based on a combination of two ideas: a sparsity-sensitive sketching aimed at sparse graphs and a random edge sampling aimed at dense graphs.

Our result improves significantly over the $O(\log \log \log n)$ algorithm of Hegeman et al. [PODC 2015] and the $O(\log \log n)$ algorithm of Lotker et al. [SPAA 2003; SICOMP 2005].

This join work with Mohsen Ghaffari, MIT, CSAIL.

## March 17, 2016

Gates 463A, 4:15 PM

## Sarah Allen (CMU)

### How to refute a random CSP

Let P be a nontrivial k-ary predicate over a finite alphabet. Consider a random CSP(P) instance I over n variables with m constraints, where each constraint is P applied to k random literals. When m >> n the instance I will be unsatisfiable with high probability, and the natural associated algorithmic task is to find a refutation of I — i.e., a certificate of unsatisfiability. When P is the 3-ary Boolean OR predicate, this is the well studied problem of refuting random 3-SAT formulas; in this case, an efficient algorithm is known only when m >> n^{3/2}. Understanding the density required for average-case refutation of other predicates is of importance for various areas of complexity, including cryptography, proof complexity, and learning theory. The main previously-known result is that for a general Boolean k-ary predicate P, having m >> n^{ceil(k/2)} random constraints suffices for efficient refutation. We give a general criterion for arbitrary k-ary predicates, one that often yields efficient refutation algorithms at much lower densities. Specifically, if P fails to support a t-wise independent (uniform) probability distribution, then there is an efficient algorithm that refutes random CSP(P) instances I with high probability, provided m >> n^{t/2} . Indeed, our algorithm will “somewhat strongly” refute I, certifying Opt(I) <=1 − Omega(1); if t = k then we furthermore get the strongest possible refutation, certifying Opt(I) <= E[P] + o(1). This last result is new even in the context of random k-SAT. As an application of our result, we falsify the “SRCSP assumptions” used to show various hardness-of-learning results in the recent (STOC 2014) work of Daniely, Linial, and Shalev–Shwartz.

This is joint work with Ryan O’Donnell and David Witmer.

## April 14, 2016

Gates 463A, 4:15 PM

## Omer Reingold (Samsung Research America)

### Constant-round Interactive-proofs for Delegating Computation

Interactive proofs, introduced by Goldwasser, Micali and Rackoff, have had a dramatic impact on Complexity Theory and Cryptography. In particular, the celebrated IP=PSPACE Theorem [LFKN92,Shamir92] allows an all-powerful but untrusted prover to convince a polynomial-time verifier of the validity of extremely complicated statements (as long as they can be evaluated using polynomial space). The interactive proof system designed for this purpose requires a large number of communication rounds and heavy computation for generating the proof.

We introduce new interactive proofs that are very efficient in the number of rounds and computation time, that are particularly well suited for delegating bounded-space computations (e.g., in the context of cloud computing). Our main result is that for every statement that can be evaluated in polynomial time and bounded-polynomial space there exists an interactive proof that satisfies the following strict efficiency requirements:

(1) the honest prover runs in polynomial time, (2) the verifier is almost linear time (and under some conditions even sub linear), and (3) the interaction consists of only a constant number of communication rounds. Prior to this work, very little was known about the power of efficient, constant-round interactive proofs.

Joint work with Guy Rothblum and Ron Rothblum

## May 5, 2016

Gates 463A, 4:15 PM

## Tselil Schramm (UC Berkeley)

### Refuting random CSPs below the spectral threshold

Random instances of 3SAT are known to be unsatisfiable with high probability when there at least 5N clauses. However, given a random 3SAT instance on N variables,the task of refuting the instance, or of proving that the instance has no satisfying assignments, is hypothesized to be computationally hard if there are O(N) clauses--in fact, the best known algorithms for refutation require instances with at least Ω(N^{3/2}) clauses, a factor of N^{1/2} beyond the unsatisfiability threshold at O(N).

In this talk, I will describe a new spectral algorithm that refutes 3SAT instances with fewer clauses, given more time. Specifically, our algorithm refutes instances with Θ(N^{3/2 - delta/2}) clauses in exp(O(N^delta)) time, giving the same guarantees as the best known polynomial-time algorithms when delta = 0, and from there smoothly approaching the unsatisfiability threshold at delta = 1. Further, our algorithm strongly refutes the instances, certifying that no assignment satisfies more than a (1-epsilon)-fraction of constraints for a constant epsilon.

## May 12, 2016

Gates 463A, 4:15 PM

## Sasha Golovnev (New York University)

### Generalizations of the Gate Elimination Method

In this talk we consider Boolean circuits over the full binary basis. It is known that almost all Boolean functions of n variables require circuits of size 2^n/n. At the same time, the best known lower bound of 3n-o(n) for an explicit Boolean function (that is, a function from NP) was given by Blum in 1984. This bound, as well as most of the previous bounds, is based on a simple method called gate elimination. The idea of the method is as follows: roughly, to prove a lower bound of cn for a function from a certain class one shows that in any circuit computing a function from the class one can find a substitution of a constant to an input variable that eliminates at least c gates from a circuit and keeps the function in the same class; the bound then follows by induction.

In this talk, we discuss generalizations of gate elimination: we consider more involved substitutions, circuit complexity measures, and generalized circuit models that simplify and improve the analysis.

We show that affine dispersers are not computable by circuits of size 3.01n. We also show that an explicit construction of dispersers for quadratic varieties (currently unknown) implies stronger lower bounds.

The talk is based on joint works with Magnus G. Find, Edward A. Hirsch, and Alexander S. Kulikov:

## May 19, 2016

Gates 463A, 4:15 PM

## Seth Pettie (UMich, Ann Arbor)

### A Brief History of O(m sqrt(n)) Time Cardinality Matching Algorithms

This talk will survey the strange history of a simple claim: that a maximum cardinality matching in a non-bipartite graph can be computed in O(m sqrt(n)) time. This claim has been made across five decades, by Even & Kariv (1975), Bartnik (1978), Micali & Vazirani (1980), Blum (1990, 2015), Gabow & Tarjan (1991), Fremuth-Paeger & Jungnickel (1999-2003), Goldberg & Karzanov (2004), and Vazirani (1994, 2012-13).

I will present a simplified analysis of the Gabow-Tarjan algorithm, and argue that it is probably the best suited for presentation in introductory algorithms courses.

## May 26, 2016

Gates 463A, 4:15 PM

## Parikshit Gopalan (VMware Research)

### Degree and Sensitivity: Tails of two distributions

The sensitivity of a Boolean function f on the hypercube is the maximum, over all inputs x, of the number of sensitive coordinates of x. The well-known sensitivity conjecture of Nisan states that every sensitivity-s Boolean function can be computed by a polynomial of degree poly(s) (independent of the number of variables). The best known upper bounds on degree, however, are exponential rather than polynomial in s.

I will sketch the proof of an approximate version of this statement: every low sensitivity function can be well-approximated by low-degree polynomials (in L_2). The proof analyzes the combinatorial structure of the graph of sensitive edges of a Boolean function f, and its behavior under random restrictions. We show that the graph of a function of full degree must be sufficiently complex, and that random restrictions of low-sensitivity functions are unlikely to yield complex graphs, by introducing and analyzing some new complexity measures (which might be independently interesting).

Finally, we present a robust version of the sensitivity conjecture and discuss possible approaches to this conjecture.

Based on joint work with Rocco Servedio (Columbia), Avishay Tal (IAS) and Avi Wigderson (IAS).

Gates 463A, 4 PM

## Tengyu Ma (Princeton)

### Polynomial-time tensor decompositions with sum-of-squares

Tensor decompositions have been the key algorithmic components in provable learning of a wide range of hidden variable models such as topic models, Gaussian mixture models, independent component analysis, dictionary learning. One of the challenges in this area is to decompose over-complete low-order tensors robustly.

In this talk I will present new algorithms based on the sum-of-squares method for tensor decomposition. Our results improve the best known running times from quasi-polynomial to polynomial for several problems, including decomposing random overcomplete 3-tensors and learning overcomplete dictionaries with constant relative sparsity. We also give the first robust analysis for decomposing overcomplete 4-tensors in the smoothed analysis model.

A key ingredient of our analysis is to establish small spectral gaps in moment matrices derived from solutions to sum-of-squares relaxations. To enable this analysis we augment sum-of-squares relaxations with spectral analogs of maximum entropy constraints.

Based on joint work with Jonathan Shi and David Steurer.