Stanford Theory Seminar

Stanford Theory Seminar (formerly known as Stanford Algorithms Seminar/AFLB) is usually held in Gates 498 or Theory Lounge(Gates 463A)
Faculty Contact: Gregory Valiant
Student Contact: Weihao Kong
Click here for directions.

June 1, 2017 4:15pm

Encina Hall, 616 Serra Mall, Stanford 94305 Bechtel International Conference Center

Piotr Indyk (MIT)

Beyond P vs. NP: Quadratic-Time Hardness For Big Data Problems

The theory of NP-hardness has been very successful in identifying problems that are unlikely to be solvable in polynomial time. However, many other important problems do have polynomial time algorithms, but large exponents in their time bounds can make them run for days, weeks or more. For example, quadratic time algorithms, although practical on moderately sized inputs, can become inefficient on big data problems that involve gigabytes or more of data. Although for many problems no sub-quadratic time algorithms are known, any evidence of quadratic-time hardness has remained elusive. In this talk I will give an overview of recent research that aims to remedy this situation. In particular, I will describe hardness results for problems in string processing (e.g., edit distance computation or regular expression matching) and machine learning (e.g., Support Vector Machines or gradient computation in neural networks). All of them have polynomial time algorithms, but despite extensive amount of research, no near-linear time algorithms have been found. I will show that, under a natural complexity-theoretic conjecture, such algorithms do not exist. I will also describe how this framework has led to the development of new algorithms for some variants of these problems.

May 25, 2017

Gates 463A, 4:15PM

Kent Quanrud (UIUC)

Near-Linear Time Approximation Schemes for some Implicit Fractional Packing Problems

We consider several implicit fractional packing problems and obtain faster implementations of approximation schemes by a general framework based on multiplicative-weight updates. This leads to new algorithms with near-linear running times for some fundamental problems in optimization. We highlight three concrete applications. The first is to find the maximum fractional packing of spanning trees in a capacitated graph; we obtain a $(1 - \epsilon)$-approximation in $\tilde{O}(m / \epsilon^2)$ time, where $m$ is the number of edges in the graph. Second, we consider the LP relaxation of the weighted unsplittable flow problem on a path and obtain a $(1- \epsilon)$-approximation in $\tilde{O}(n/\epsilon^2)$ time, where $n$ is the number of demands. Third, given an undirected edge weighted graph on $m$ edges, we obtain a $(1+\epsilon)$-approximation to the Held-Karp bound for the Metric-TSP instance induced by the shortest path metric in the graph in $\tilde{O}(m/\epsilon^2)$ time. Each of these algorithms give an order of magnitude improvement over previous results.

This is joint work with Chandra Chekuri.

May 24, 2017

Gates 463A, 4:15PM

Pravesh Kothari (Princeton)

Quantum Entanglement, Sum-of-Squares, and the Log-Rank Conjecture

In this talk, I will show a $\exp(\sqrt(n) \mathsf{poly} \log{n})$-time algorithm for the Best Separable State (BSS) problem (see next paragraph for a classical algorithmic formulation): given an $n^2 \times n^2$ quantum measurement matrix $M$, distinguish between the cases that there is a separable (i.e., non-entangled) state $\rho$ that M accepts with probability $1$, and the case that every separable state is accepted with probability at most 0.99. Aside from being a central question in quantum information theory arising in the study of entanglement, recent works have uncovered potentially useful connections between BSS and fundamental problems in classical algorithm design such as Max-Cut, Small-Set-Expansion, and Unique Games.

Equivalently, the algorithm takes input a subspace $W \subseteq \mathbb{R}^{n^2}$ and distinguishes between the case that $W$ contains a rank one matrix and the case that every rank one matrix is at least $\epsilon$ far (in the Euclidean distance) from $W$.

At the heart of our result is a general rounding paradigm that uses polynomial reweightings to round the solution to the Sum-of-Squares (SoS) semidefinite programming relaxations. Somewhat surprisingly, the construction of such a polynomial reweighting scheme uses an argument inspired by the recent breakthrough on log-rank conjecture by Lovett (STOC’14, JACM’16) who showed that the communication complexity of every rank-n Boolean matrix is bounded by $\sqrt{n} poly log(n)$.

I will assume no specialized background in the talk.

Based on joint work with Boaz Barak (Harvard) and David Steurer (Cornell, IAS).

May 18, 2017

Gates 463A, 4:15PM

Nika Haghtalab (CMU)

Oracle-efficient Online Learning and Applications to Auction Design

We consider the fundamental problem of learning from expert advice, a.k.a online no-regret learning, where we have access to an offline optimization oracle that can be used to compute, in constant time, the best performing expert at any point in time. We consider the design of no-regret algorithms that are computationally efficient using such an oracle. We present structural properties under which we show oracle-efficient no-regret algorithms exist, even when the set of experts is exponentially large in a natural representation of the problem. Our algorithm is a generalization of the Follow-The-Perturbed-Leader algorithm of Kalai and Vempala that at every step follows the best-performing expert subject to some perturbations. Our design uses a shared source of randomness across all experts that can be efficiently implemented by using an oracle on a random modification of the history of the play at every time step.

Our second main contribution is showing that the structural properties required for our oracle-efficient online algorithm are present in a large class problems. As an example, we discuss applications of our oracle-efficient learning results to the adaptive optimization of a large class of auctions, including (1) VCG auctions with bidder-specific reserves in single-parameter settings, (2) envy-free item pricing in multi-item auctions, and (3) Myerson-like auctions for single-item settings.

Bio: Nika Haghtalab is a Ph.D. student at the Computer Science department of Carnegie Mellon University, co-advised by Avrim Blum and Ariel Procaccia. Her research interests include machine learning theory, computational aspects of economics, and algorithms. Nika is a recipient of the IBM and Microsoft Research Ph.D. fellowships.

May 18, 2017

Gates 463A, 3PM

Sam Hopkins (Cornell)

Sample-optimal inference, the method of moments, and community detection

We propose a simple and efficient meta-algorithm for Bayesian estimation problems (i.e. hidden variable, latent variable, or planted problems). Our algorithm uses low-degree polynomials together with new and highly robust tensor decomposition methods. We focus on the question: for a given estimation problem, precisely how many samples (up to low-order additive terms) do polynomial-time algorithms require to obtain good estimates of hidden variables? Our meta-algorithm is broadly applicable, and achieves statistical or conjectured computational sample-complexity thresholds for many well-studied problems, including many for which previous algorithms were highly problem-specific.

As a running example we employ the stochastic block model -- a widely studied family of random graph models which contain latent community structure. We recover and unify the proofs of the best-known sample complexity bounds for the partial recovery problem in this model. We also give the first provable guarantees for partial recovery of community structure in constant-degree graphs where nodes may participate in many communities simultaneously. This model is known to exhibit a sharp sample complexity threshold -- with fewer than a very specific number of samples, recovering community structure becomes impossible. While previous explanations for this phenomenon appeal to sophisticated ideas from statistical mechanics, we give a new and simple explanation based on properties of low-degree polynomials.

Joint work with David Steurer.

May 11, 2017

Gates 463A, 4:15PM

Dan Alistarh (ETH Zurich & IST Austria)

Data Structures of the Future: Concurrent, Optimistic, and Relaxed

A central computing trend over the last decade has been the need to process increasingly larger amounts of data as efficiently as possible. This development is challenging both software and hardware design, and is altering the way data structures are constructed, implemented, and deployed.

In this talk, I will present examples of such new data structure design ideas and implementations. In particular, I will discuss some inherent limitations of parallelizing classic data structures, and then focus on approaches to circumvent these limitations. The first approach is to relax the software semantics, to allow for approximation, randomization, or both. The second is to modify the underlying hardware architecture to unlock more parallelism. Time permitting, I will also cover results showing that both approaches can improve real-world performance, and touch upon some of the major open questions in the area.

Short bio: Dan Alistarh is an Assistant Professor at IST Austria, currently visiting ETH Zurich on an SNF Ambizione Fellowship. Previously, he was a Researcher at Microsoft Research, Cambridge, UK, and a Postdoctoral Associate at MIT CSAIL. He received his PhD from the EPFL, under the guidance of Prof. Rachid Guerraoui. His research focuses on distributed algorithms and concurrent data structures, and spans from algorithms and lower bounds to practical implementations.

March 2, 2017

Gates 463A, 4:15 PM

Shachar Lovett (UCSD)

A robust analogue of the sensitivity conjecture

The sensitivity conjecture is a famous open problem in the theory of boolean functions. Let f be a boolean function defined on the hypercube. The sensitivity of a node x is the number of its neighbours in the hypercube, for which f give the opposite value as that it does to x. The sensitivity conjecture speculates that if all nodes have low sensitivity, then the function f must be simple. Concretely, all its Fourier mass is supported on levels with low hamming weight.

Gopalan et al [CCC 2016] conjectured a robust analogue of the sensitivity conjecture: if most of the nodes have low sensitivity, then most of the Fourier mass is supported on levels with low hamming weight. They proved it under the stronger assumption, that all nodes have low sensitivity. In this work, we prove the robust sensitivity conjecture with near optimal quantitative bounds. This provides new insights on the relation between smoothness and the Fourier structure of boolean functions.

The talk does not assume any previous knowledge.

Joint work with Avishay Tal (IAS) and Jiapeng Zhang (UCSD).

February 9, 2017

Gates 463A, 4:15 PM

Fan Wei (Stanford)

Polynomial complexity for local MAX-CUT algorithm

In 1988, Johnson, Papadimitriou and Yannakakis wrote that ``Practically all the empirical evidence would lead us to conclude that finding locally optimal solutions is much easier than solving NP-hard problems". Since then the empirical evidence has continued to amass but formal proofs of this phenomenon have remained elusive. A canonical (and indeed complete) example is the local max-cut problem for which no polynomial time method is known. In a breakthrough paper, Etscheid and R{\"o}glin proved that the {\em smoothed} complexity of local max-cut is quasi-polynomial. In this paper we prove smoothed polynomial complexity for local max-cut, thus confirming that finding locally optimal solutions for max-cut is much easier than solving it.

This is a joint work with Omer Angel, Sebastien Bubeck, and Yuval Peres.

February 2, 2017

Gates 463A, 4:15 PM

Andrej Risteski (Princeton)

How to calculate partition functions using convex programming hierarchies: provable bounds for variational methods

The problem of approximating partition functions for graphical models is a very fundamental one: from a practical point of view it is linked intimately to common tasks like calculating marginals in the model and sampling from the model; from a theoretical it is linked to understanding the class #P, which intuitively captures counting problems.

We give provable bounds for algorithms that are based on variational methods: a technique which is very popular in practice but rather poorly understood theoretically in comparison to MCMC methods.

We make use of recent tools in combinatorial optimization: the Sherali-Adams and Lasserre convex programming hierarchies to get algorithms for "dense" Ising models. These techniques give new, non-trivial approximation guarantees for the partition function beyond the regime of correlation decay. They also generalize some classical results from statistical physics about the Curie-Weiss ferromagnetic Ising model, as well as provide a partition function counterpart of classical results about max-cut on dense graphs. With this, we connect techniques from two apparently disparate research areas -- optimization and counting/partition function approximations.

Time permitting, we also will show worst-case guarantees for coarse (multiplicative) approximations to the log-partition functions in the worst-case setting (i.e. for arbitrary Ising models).

Janury 26, 2017

Gates 463A, 4:15 PM

Eric Price (UT Austin)

Fourier Sparsity, Polynomials, and Fast Interpolation

In recent years, a number of works have studied methods for computing the Fourier transform in sublinear time if the output is sparse. Most of these have focused on the discrete setting, even though in many applications the input signal is continuous and naive discretization significantly worsens the sparsity level.

In the continuous setting, if the signal is sampled over a duration T, it is impossible to robustly identify frequencies that lie too close to each other (within 1/T). We show that one can achieve this bound to within a log factor while maintaining constant approximation factor and good sample complexity. We furthermore show that the barrier does not apply to estimating the signal as a whole: we give a sample-efficient algorithm to robustly estimate Fourier-sparse signals, regardless of the frequency gap.

As a special case, we get an algorithm to interpolate degree d polynomials from noisy measurements, using O(d) samples and O~(d) time, while increasing the error by a constant factor in L2.

Joint work with Xue Chen, Daniel Kane, and Zhao Song. Based on https://arxiv.org/abs/1609.00896 and https://arxiv.org/abs/1609.01361 .

Janury 19, 2017

Gates 463A, 4:15 PM

David Karger (MIT)

A Fast and Simple Unbiased Estimator for Network (Un)reliability

The following procedure yields an unbiased estimator for the disconnection probability of an nn-vertex graph with minimum cut c if every edge fails independently with probability p: (i) contract every edge independently with probability 1−n^{−2/c}, then (ii) recursively compute the disconnection probability of the resulting tiny graph if each edge fails with probability pn^{2/c}. We give a short, simple, self-contained proof that this estimator can be computed in linear time and has relative variance O(n^2). Combining these two facts with a standard sparsification argument yields an O(n^3 log n)-time algorithm for estimating the (un)reliability of a network. We also show how the technique can be used to create unbiased samples of disconnected networks.

Janury 12, 2017

Gates 463A, 4:15 PM

Vitaly Feldman(IBM Almaden)

Lower bounds against convex relaxations via the statistical query complexity

In this talk I'll show how algorithms for convex optimization can be used to derive lower bounds against general convex relaxations for constraint satisfaction problems. This technique relies on several recent advances in understanding of statistical query* (SQ) complexity:
1. A notion of SQ dimension of a problem that lower bounds the SQ complexity
2. Lower bounds on the SQ dimension of constraint satisfaction problems
3. SQ algorithms for stochastic convex optimization.
I'll overview these results and give some of their additional applications.

* Statistical query algorithms (Kearns, 1993) are algorithms that instead of random samples from an input distribution D over a domain X, have access to a SQ oracle for D. Given a real-valued query function g over X, the oracle returns an estimate of the expectation of g on a sample chosen randomly from D.

Based on joint works with C. Guzman, W. Perkins and S. Vempala.

Decemeber 8, 2016

Gates 463A, 4:15 PM

Yin Tat Lee (Microsoft Research Redmond)

Geodesic Walks

We introduce the geodesic walk for sampling Riemannian manifolds and apply it to the problem of generating uniform random points from polytopes in R^n specified by m inequalities. The walk is a discrete-time simulation of a stochastic differential equation (SDE) on the Riemannian manifold. The resulting sampling algorithm for polytopes mixes in O*(mn^{3/4}) steps. This is the first walk that breaks the quadratic barrier for mixing in high dimension, improving on the previous best bound of O*(mn) by Kannan and Narayanan for the Dikin walk. We also show that each step of the geodesic walk (solving an ODE) can be implemented efficiently, thus improving the time complexity for sampling polytopes.

Decemeber 1, 2016

Gates 463A, 4:15 PM

Ilias Diakonikolas (USC)

Computational Efficiency and Robust Statistics

Abstract: We consider the following basic problem: Given corrupted samples from a high-dimensional Gaussian, can we efficiently learn its parameters? This is theprototypical question in robust statistics, a field that took shape in the 1960's with the pioneering works of Tukey and Huber. Unfortunately, all known robust estimators are hard to compute in high dimensions. This prompts the following question: Can we reconcile robustness and computational efficiency in high-dimensional learning?

In this work, we give the first efficient algorithms for robustly learning a high-dimensional Gaussian that are able to tolerate a constant fraction of corruptions. Our techniques also yield robust estimators for several other high-dimensional models, including Bayesian networks, and various mixture models.

The talk will be based on joint works with (different subsets of) G. Kamath, D. Kane, J. Li, A. Moitra, and A. Stewart.

November 30, 2016

Gates 463A, 4:15 PM

Monika Henzinger (University of Vienna)

Local Flow Partitioning for Faster Edge Connectivity or Flow beats PageRank

We present a new deterministic algorithm for computing the unweighted minimum cut, aka edge connectivity, of an undirected graph with n nodes and m edges in time O(m log^2 n log log ^2 n). It improves the previous break-through result by Kawarabayashi and Thorup, which uses a PageRank-based subroutine and takes time O(m log^12 n). To achieve the running time improvement we replace the PageRank-based computation by a flow-based computation.

Note that our algorithm is also faster than the fastest randomized algorithm for this problem, which is Karger's 1996 O(m log^3 n)-time algorithm.

This is joint work with Satish Rao and Di Wang and will appear at SODA 2017.

November 17, 2016

Gates 463A, 4:15 PM

Benjamin Moseley (Washington University, St. Louis)

Algorithmic Methods for Massively Parallel Data Science

This talk is concerned with designing algorithms for large scale data science using massively parallel computation. The talk will discuss theoretical models and algorithms for massively parallel frameworks such as MapReduce and Spark. The constraints of the models are well connected to practice, but pose several algorithmic challenges. This talk introduces recent developments that overcome these challenges, widely applicable massively parallel algorithmic techniques and key questions on the theoretical foundations of massively parallel computation. The methods introduced will be applied to large data problems that are central to theoretical computer science and machine learning, clustering and submodular function optimization.

The work in this talk has been supported by Google, Yahoo and the NSF.

November 10, 2016

Gates 463A, 4:15 PM

Sam Hopkins (Cornell)

A Nearly-Tight Sum-of-Squares Lower Bound for the Planted Clique Problem

We prove that with high probability over choice of a random graph G from G(n,1/2), the n^O(d)-time degree-d sum-of-squares semidefinite programming relaxation for the clique problem will give a value of at least n^{1/2 - c(d/log(n))^(1/2) } for some c > 0, yielding a nearly tight n^(1/2 - o(1)) bound on the value of this program for any d = o(log n).

Moreover, we introduce a new framework that we call pseudocalibration to construct sum-of-squares lower bounds. It yields a general recipe for constructing good pseudo-distributions (i.e. dual certificates for the sum-of-squares semidefinite program) and sheds further light on the ways in which this hierarchy differs from others.

Joint work with Boaz Barak, Jon Kelner, Pravesh Kothari, Ankur Moitra, and Aaron Potechin.

November 3, 2016

Gates 463A, 4:15 PM

Gil Cohen (Princeton)

Recent advances in randomness extractors and their applications

We survey recent developments in randomness extractors and their applications to classical problems such as Ramsey graphs constructions and privacy amplification protocols. This exciting progress heavily relies on two new pseudo-random primitives we call correlation breakers and independence-preserving mergers, which we discuss.

October 27, 2016

Gates 415, 4:15 PM

Vasilis Syrgkanis (Microsoft Research, New England)

Computationally Efficient Learning in Auctions

A line of recent work provides welfare guarantees of simple combinatorial auction formats, such as selling m items via simultaneous second price auctions. These guarantees hold even when the auctions are repeatedly executed and players use no-regret learning algorithms. Unfortunately, off-the-shelf no-regret algorithms for these auctions are computationally inefficient as the number of actions is exponential. We show that this obstacle is insurmountable: there are no polynomial-time no-regret algorithms for simultaneous second-price auctions, even when the bidders are unit-demand. Our lower bound raises the question of how good outcomes polynomially-bounded bidders may discover in such auctions. To answer this question, we propose a novel concept of learning in auctions, termed "no-envy learning." This notion is founded upon Walrasian equilibrium, and we show that it is both efficiently implementable and results in approximately optimal welfare, even when the bidders have fractionally subadditive (XOS) valuations (assuming demand oracles) or coverage valuations (without demand oracles). Our results for XOS valuations are enabled by a novel Follow-The-Perturbed-Leader algorithm for settings where the number of experts is infinite, and the payoff function of the learner is non-linear. Our result for coverage valuations is based on a novel use of convex rounding schemes and a reduction to online convex optimization.

Joint work with Constantinos Daskalakis Link to paper: https://arxiv.org/abs/1511.01411

October 20, 2016

Gates 415, 4:15 PM

Nir Ailon (Technion-Israel Institute of Technology)

Fourier Transform Computational Lower Bounds

The Fourier transform is one of the most important linear operators in engineering. Computing it for a given input vector x in n-space can be done in time Theta(n log n) using Cooley and Tukey's FFT (Fast Fourier Transform). A lower bound better than trivial \Omega(n) has been evasive. In this talk I will show the surprising connection between speeding up FFT and uncertainty: If you could (theoretically) speed up FFT without changing the word size on the machine, then you would create uncertainty in the data in the form of numerical overflow or underflow. Alas, increasing the machine word size costs more time per multiplication/addition operation.

I will explain and prove this principle, and present two very recent related bounds together with a main conjecture and some related open problems.

The talk requires standard linear algebra and very basic familiaritywith Shannon entropy.

October 13, 2016

Gates 415, 4:15 PM

Mary Wooters (Stanford)

Repairing Reed-Solomon Codes

Reed-Solomon (RS) codes are often used in distributed storage. However, recently it's been observed that the traditional recovery algorithms for RS codes are substantially sub-optimal in this setting, and the quickly-developing field of regenerating codes has provided some much better solutions.

In this work, we show that, in fact, RS codes are much better for distributed storage than you might think! Our main result is that, in some parameter regimes, RS codes themselves are optimal regenerating codes, among MDS codes with linear repair schemes. Moreover, we give a characterization of MDS codes with good linear repair schemes which holds in any parameter regime, and which can be used to give non-trivial repair schemes for RS codes in other settings.

Along the way, we'll develop a surprising fact about low-degree polynomials (at least, surprising to me). A fundamental fact about polynomial interpolation is that k evaluations determine a degree k-1 polynomial. This is necessary in a strong sense: if we have only k-1 evaluations, we don't learn anything at all about the evaluation on a k'th point. However, our repair scheme for RS codes shows that if we are allowed to consider partial evaluations (in the sense that we only get a few bits of information about the evaluation), then in fact we can do better (in the sense that we can use fewer bits total to determine an unknown evaluation).

This is joint work with Venkat Guruswami.

October 6, 2016

Gates 463A, 4:15 PM

Aaron Sidford (Stanford)

Faster Algorithms for Computing the Stationary Distribution, Simulating Random Walks, and More

In this talk, I will present the first algorithms for solving a wide range of fundamental problems on directed graphs in less time than is required to solve a general system of linear equations. In particular I will discuss computing the stationary distribution of a directed random walk, computing Personalized PageRank with a polylogarithmic dependence on error and restart probability, and computing many of the basic properties of a non-reversible Markov chain, including the hitting times, escape probabilities, and all-pairs round-trip commute times.

To achieve these improved running times I will provide an improved algorithm for computing solutions to directed Laplacian systems, a natural generalization of symmetric or undirected Laplacian systems. In particular, I will discuss how to solve a directed Laplacian systems associated with a directed graph with "m" edges and "n" vertices in time \tilde{O}(m^{3/4}n+mn^{2/3}), where the \tilde{O} notation suppresses polylogarithmic factors n, m, the accuracy desired, and the appropriate condition number of the problem. For sparse graphs with polynomial condition numbers, this gives a running time of \tilde{O}(n^{7/4}), which breaks the O(n^{2}) barrier that would be the best that one could hope to achieve using fast matrix multiplication. Along the way, I will introduce some general techniques for reasoning about directed Laplacian systems that may be of independent interest.

This talk reflects joint work with Michael B. Cohen, Jonathan Kelner, John Peebles, Richard Peng, and Adrian Vladu.