Stanford Theory Seminar
Stanford Theory Seminar (formerly known as Stanford
Algorithms Seminar/AFLB) is usually held in Gates 498 or the Theory
Lounge (Gates 463A).
Faculty Contact: Gregory Valiant
Student Contact: Weihao Kong
June 1, 2017 4:15pm
Bechtel International Conference Center, Encina Hall, 616 Serra Mall, Stanford, CA 94305
Piotr Indyk (MIT)
Beyond P vs. NP: Quadratic-Time Hardness For Big Data Problems
The theory of NP-hardness has been very successful in identifying problems that are unlikely to be solvable in polynomial time. However, many other important problems do have polynomial time algorithms, but large exponents in their time bounds can make them run for days, weeks or more. For example, quadratic time algorithms, although practical on moderately sized inputs, can become inefficient on big data problems that involve gigabytes or more of data. Although for many problems no sub-quadratic time algorithms are known, evidence of quadratic-time hardness has remained elusive.
In this talk I will give an overview of recent research that aims to remedy this situation. In particular, I will describe hardness results for problems in string processing (e.g., edit distance computation or regular expression matching) and machine learning (e.g., Support Vector Machines or gradient computation in neural networks). All of them have polynomial time algorithms, but despite an extensive amount of research, no near-linear time algorithms have been found. I will show that, under a natural complexity-theoretic conjecture, such algorithms do not exist. I will also describe how this framework has led to the development of new algorithms for some variants of these problems.
May 25, 2017
Gates 463A, 4:15PM
Kent Quanrud (UIUC)
Near-Linear Time Approximation Schemes for some Implicit Fractional Packing Problems
We consider several implicit fractional packing problems and
obtain faster implementations of approximation schemes by a general
framework based on multiplicative-weight updates. This leads to new
algorithms with near-linear running times for some fundamental
problems in optimization. We highlight three concrete applications.
The first is to find the maximum fractional packing of spanning trees
in a capacitated graph; we obtain a $(1 - \epsilon)$-approximation in
$\tilde{O}(m / \epsilon^2)$ time, where $m$ is the number of edges in
the graph. Second, we consider the LP relaxation of the weighted
unsplittable flow problem on a path and obtain a $(1-
\epsilon)$-approximation in $\tilde{O}(n/\epsilon^2)$ time, where $n$
is the number of demands. Third, given an undirected edge weighted
graph on $m$ edges, we obtain a $(1+\epsilon)$-approximation to the
Held-Karp bound for the Metric-TSP instance induced by the shortest
path metric in the graph in $\tilde{O}(m/\epsilon^2)$ time. Each of
these algorithms gives an order-of-magnitude improvement over previous
results.
This is joint work with Chandra Chekuri.
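The multiplicative-weight-update framework behind these results can be illustrated in its simplest form, the classic Hedge algorithm for learning from experts. This is a generic sketch for illustration only; the talk's packing solvers build problem-specific oracles on top of this template, and all parameter choices below are assumptions.

```python
import math
import random

def hedge(loss_matrix, eta):
    """Multiplicative-weights (Hedge) over n experts.

    loss_matrix[t][i] is expert i's loss in [0, 1] at step t.
    Returns the algorithm's total expected loss."""
    n = len(loss_matrix[0])
    w = [1.0] * n
    total = 0.0
    for losses in loss_matrix:
        s = sum(w)
        p = [wi / s for wi in w]           # play the normalized weights
        total += sum(pi * li for pi, li in zip(p, losses))
        # multiplicative update: penalize each expert in proportion to its loss
        w = [wi * (1.0 - eta * li) for wi, li in zip(w, losses)]
    return total

random.seed(0)
T, n = 400, 8
losses = [[random.random() for _ in range(n)] for _ in range(T)]
eta = min(0.5, math.sqrt(math.log(n) / T))
alg = hedge(losses, eta)
best = min(sum(losses[t][i] for t in range(T)) for i in range(n))
# standard guarantee for losses in [0,1]: alg <= (1 + eta) * best + ln(n)/eta
```

The near-linear running times in the talk come from implementing the per-step update implicitly via data structures, rather than touching every constraint as this toy version does.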
May 24, 2017
Gates 463A, 4:15PM
Pravesh Kothari (Princeton)
Quantum Entanglement, Sum-of-Squares, and the Log-Rank Conjecture
In this talk, I will show a $\exp(\sqrt{n}\,\mathsf{poly}\log n)$-time algorithm for the Best Separable State (BSS) problem (see next paragraph for a classical algorithmic formulation): given an $n^2 \times n^2$ quantum measurement matrix $M$, distinguish between the case that there is a separable (i.e., non-entangled) state $\rho$ that $M$ accepts with probability $1$, and the case that every separable state is accepted with probability at most 0.99. Aside from being a central question in quantum information theory arising in the study of entanglement, recent works have uncovered potentially useful connections between BSS and fundamental problems in classical algorithm design such as Max-Cut, Small-Set-Expansion, and Unique Games.
Equivalently, the algorithm takes as input a subspace $W \subseteq \mathbb{R}^{n^2}$ and distinguishes between the case that $W$ contains a rank one matrix and the case that every rank one matrix is at least $\epsilon$ far (in the Euclidean distance) from $W$.
At the heart of our result is a general rounding paradigm that uses polynomial reweightings to round the solution to the Sum-of-Squares (SoS) semidefinite programming relaxations. Somewhat surprisingly, the construction of such a polynomial reweighting scheme uses an argument inspired by the recent breakthrough on the log-rank conjecture by Lovett (STOC'14, JACM'16), who showed that the communication complexity of every rank-$n$ Boolean matrix is bounded by $\sqrt{n}\,\mathrm{poly}\log(n)$.
I will assume no specialized background in the talk.
Based on joint work with Boaz Barak (Harvard) and David Steurer (Cornell, IAS).
May 18, 2017
Gates 463A, 4:15PM
Nika Haghtalab (CMU)
Oracle-efficient Online Learning and Applications to Auction Design
We consider the fundamental problem of learning from expert advice,
a.k.a. online no-regret learning, where we have access to an offline
optimization oracle that can be used to compute, in constant time, the
best performing expert at any point in time. We consider the design of
no-regret algorithms that are computationally efficient using such an
oracle. We present structural properties under which we show
oracle-efficient no-regret algorithms exist, even when the set of
experts is exponentially large in a natural representation of the
problem. Our algorithm is a generalization of the
Follow-The-Perturbed-Leader algorithm of Kalai and Vempala that at
every step follows the best-performing expert subject to some
perturbations. Our design uses a shared source of randomness across
all experts that can be efficiently implemented by using an oracle on
a random modification of the history of the play at every time step.
Our second main contribution is showing that the structural properties
required for our oracle-efficient online algorithm are present in a
large class of problems. As an example, we discuss applications of our
oracle-efficient learning results to the adaptive optimization of a
large class of auctions, including (1) VCG auctions with
bidder-specific reserves in single-parameter settings, (2) envy-free
item pricing in multi-item auctions, and (3) Myerson-like auctions for
single-item settings.
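A minimal sketch of the Follow-The-Perturbed-Leader template that the algorithm generalizes: at each step, follow the expert with lowest perturbed cumulative loss. This is illustration only; the talk's contribution, the shared oracle-implementable source of randomness across exponentially many experts, is not captured here, and the loss data and perturbation scale are assumptions.

```python
import random

def ftpl(loss_matrix, scale, rng):
    """Follow-The-Perturbed-Leader over n experts.

    At each step, play the expert minimizing cumulative past loss minus
    a random exponential perturbation (drawn fresh per step here for
    simplicity)."""
    n = len(loss_matrix[0])
    cum = [0.0] * n
    total = 0.0
    for losses in loss_matrix:
        noise = [rng.expovariate(1.0 / scale) for _ in range(n)]
        leader = min(range(n), key=lambda i: cum[i] - noise[i])
        total += losses[leader]
        for i in range(n):
            cum[i] += losses[i]
    return total

rng = random.Random(1)
T, n = 300, 5
# expert 0 is reliably good; the others are noisy
losses = [[0.1] + [rng.random() for _ in range(n - 1)] for _ in range(T)]
alg = ftpl(losses, scale=10.0, rng=rng)
best = min(sum(row[i] for row in losses) for i in range(n))
```

With exponentially many experts this loop over `range(n)` is exactly what becomes infeasible; the oracle-efficient version instead feeds a perturbed history to the offline optimization oracle.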
Bio: Nika Haghtalab is a Ph.D. student at the Computer Science
department of Carnegie Mellon University, co-advised by Avrim Blum and
Ariel Procaccia. Her research interests include machine learning
theory, computational aspects of economics, and algorithms. Nika is a
recipient of the IBM and Microsoft Research Ph.D. fellowships.
May 18, 2017
Gates 463A, 3PM
Sam Hopkins (Cornell)
Sample-optimal inference, the method of moments, and community detection
We propose a simple and efficient meta-algorithm for Bayesian estimation problems (i.e. hidden variable, latent variable, or planted problems). Our algorithm uses low-degree polynomials together with new and highly robust tensor decomposition methods. We focus on the question: for a given estimation problem, precisely how many samples (up to low-order additive terms) do polynomial-time algorithms require to obtain good estimates of hidden variables? Our meta-algorithm is broadly applicable, and achieves statistical or conjectured computational sample-complexity thresholds for many well-studied problems, including many for which previous algorithms were highly problem-specific.
As a running example we employ the stochastic block model -- a widely studied family of random graph models which contain latent community structure. We recover and unify the proofs of the best-known sample complexity bounds for the partial recovery problem in this model. We also give the first provable guarantees for partial recovery of community structure in constant-degree graphs where nodes may participate in many communities simultaneously. This model is known to exhibit a sharp sample complexity threshold -- with fewer than a very specific number of samples, recovering community structure becomes impossible. While previous explanations for this phenomenon appeal to sophisticated ideas from statistical mechanics, we give a new and simple explanation based on properties of low-degree polynomials.
Joint work with David Steurer.
May 11, 2017
Gates 463A, 4:15PM
Dan Alistarh (ETH Zurich & IST Austria)
Data Structures of the Future: Concurrent, Optimistic, and Relaxed
A central computing trend over the last decade has been the need to
process increasingly larger amounts of data as efficiently as possible. This
development is challenging both software and hardware design, and is altering
the way data structures are constructed, implemented, and
deployed.
In this talk, I will present examples of such new data structure design ideas
and implementations. In particular, I will discuss some inherent limitations of
parallelizing classic data structures, and then focus on approaches to
circumvent these limitations. The first approach is to relax the software
semantics, to allow for approximation, randomization, or both. The second is to
modify the underlying hardware architecture to unlock more parallelism. Time
permitting, I will also cover results showing that both approaches can improve
real-world performance, and touch upon some of the major open questions in the
area.
Short bio: Dan Alistarh is an Assistant Professor at IST Austria, currently
visiting ETH Zurich on an SNF Ambizione Fellowship. Previously, he was a
Researcher at Microsoft Research, Cambridge, UK, and a Postdoctoral Associate at
MIT CSAIL. He received his PhD from the EPFL, under the guidance of Prof. Rachid
Guerraoui. His research focuses on distributed algorithms and concurrent data
structures, and spans from algorithms and lower bounds to practical
implementations.
March 2, 2017
Gates 463A, 4:15 PM
Shachar Lovett (UCSD)
A robust analogue of the sensitivity conjecture
The sensitivity conjecture is a famous open problem in the theory of
boolean functions. Let f be a boolean function defined on the
hypercube. The sensitivity of a node x is the number of its neighbours
in the hypercube on which f takes the value opposite to the one it
takes on x. The sensitivity conjecture speculates that if all nodes
have low sensitivity, then the function f must be simple: concretely,
all its Fourier mass is supported on levels with low Hamming weight.
Gopalan et al [CCC 2016] conjectured a robust analogue of the
sensitivity conjecture: if most of the nodes have low sensitivity,
then most of the Fourier mass is supported on levels with low Hamming
weight. They proved it under the stronger assumption that all nodes
have low sensitivity. In this work, we prove the robust sensitivity
conjecture with near optimal quantitative bounds. This provides new
insights on the relation between smoothness and the Fourier structure
of boolean functions.
The talk does not assume any previous knowledge.
Joint work with Avishay Tal (IAS) and Jiapeng Zhang (UCSD).
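For concreteness, sensitivity can be computed directly for small functions. The sketch below (an illustration, not taken from the paper) contrasts AND, where only a handful of nodes have high sensitivity so the average is small, with parity, where every node has maximal sensitivity:

```python
from itertools import product

def sensitivity(f, x):
    """Number of Hamming neighbours of x on which f disagrees with f(x)."""
    n = len(x)
    return sum(f(x[:i] + (1 - x[i],) + x[i + 1:]) != f(x) for i in range(n))

def max_and_avg_sensitivity(f, n):
    """Maximum and average sensitivity of f over the n-dimensional hypercube."""
    cube = list(product((0, 1), repeat=n))
    vals = [sensitivity(f, x) for x in cube]
    return max(vals), sum(vals) / len(cube)

AND = lambda x: int(all(x))       # sensitive only near the all-ones point
PARITY = lambda x: sum(x) % 2     # every neighbour flips the value

n = 4
# AND: max sensitivity n, but average sensitivity only n / 2^(n-1)
print(max_and_avg_sensitivity(AND, n))
# parity: every node has sensitivity n
print(max_and_avg_sensitivity(PARITY, n))
```

AND is the kind of function the robust conjecture addresses: most nodes have sensitivity 0, and indeed its Fourier mass is concentrated on low levels, whereas parity's mass sits entirely on the top level.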
February 9, 2017
Gates 463A, 4:15 PM
Fan Wei (Stanford)
Polynomial complexity for local MAX-CUT algorithm
In 1988, Johnson, Papadimitriou and Yannakakis wrote that
``Practically all the empirical evidence would lead us to conclude
that finding locally optimal solutions is much easier than solving
NP-hard problems". Since then the empirical evidence has continued to
amass but formal proofs of this phenomenon have remained elusive. A
canonical (and indeed complete) example is the local max-cut problem
for which no polynomial time method is known. In a breakthrough paper,
Etscheid and R{\"o}glin proved that the {\em smoothed} complexity of
local max-cut is quasi-polynomial. In this paper we prove smoothed
polynomial complexity for local max-cut, thus confirming that finding
locally optimal solutions for max-cut is much easier than solving it.
This is joint work with Omer Angel, Sebastien Bubeck, and Yuval Peres.
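The FLIP local search whose smoothed complexity is analyzed can be stated in a few lines: repeatedly move any vertex whose flip increases the cut weight until no single flip helps. This sketch (illustration only; the graph instance is an assumption) also exhibits the standard property that any local optimum cuts at least half the total edge weight:

```python
def flip_local_search(n, edges, side=None):
    """Local search for max-cut: flip a vertex while doing so increases
    the cut weight; stop at a local optimum.

    edges: list of (u, v, weight) with u, v in range(n)."""
    side = list(side) if side else [0] * n
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def gain(v):
        # change in cut weight if v switches sides
        return sum(w if side[v] == side[u] else -w for u, w in adj[v])

    improved = True
    while improved:
        improved = False
        for v in range(n):
            if gain(v) > 0:
                side[v] = 1 - side[v]
                improved = True
    cut = sum(w for u, v, w in edges if side[u] != side[v])
    return side, cut

edges = [(0, 1, 3.0), (1, 2, 2.0), (2, 3, 4.0), (3, 0, 1.0), (0, 2, 5.0)]
side, cut = flip_local_search(4, edges)
```

The smoothed analysis asks how many such improving flips can occur when the weights are randomly perturbed; this paper shows the answer is polynomial.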
February 2, 2017
Gates 463A, 4:15 PM
Andrej Risteski (Princeton)
How to calculate partition functions using convex programming hierarchies: provable bounds for variational methods
The problem of approximating partition functions for graphical models is a very fundamental one: from a practical point of view it is linked intimately to
common tasks like calculating marginals in the model and sampling from the model; from a theoretical point of view it is linked to understanding the class #P, which intuitively
captures counting problems.
We give provable bounds for algorithms that are based on variational methods: a technique which is very popular in practice but rather poorly understood theoretically in comparison to MCMC methods.
We make use of recent tools in combinatorial optimization: the Sherali-Adams and Lasserre convex programming hierarchies to get algorithms
for "dense" Ising models. These techniques give new, non-trivial approximation guarantees for the partition function beyond the regime of correlation decay.
They also generalize some classical results from statistical physics about the Curie-Weiss ferromagnetic Ising model, as well as provide a partition function counterpart of classical results about max-cut on dense graphs. With this, we connect techniques from two apparently disparate research areas -- optimization and counting/partition function approximations.
Time permitting, we also will show worst-case guarantees for coarse (multiplicative) approximations to the log-partition functions in the worst-case setting (i.e. for arbitrary Ising models).
January 26, 2017
Gates 463A, 4:15 PM
Eric Price (UT Austin)
Fourier Sparsity, Polynomials, and Fast Interpolation
In recent years, a number of works have studied methods for computing
the Fourier transform in sublinear time if the output is sparse. Most
of these have focused on the discrete setting, even though in many
applications the input signal is continuous and naive discretization
significantly worsens the sparsity level.
In the continuous setting, if the signal is sampled over a duration T,
it is impossible to robustly identify frequencies that lie too close
to each other (within 1/T). We show that one can achieve this bound
to within a log factor while maintaining constant approximation factor
and good sample complexity. We furthermore show that the barrier does
not apply to estimating the signal as a whole: we give a
sample-efficient algorithm to robustly estimate Fourier-sparse
signals, regardless of the frequency gap.
As a special case, we get an algorithm to interpolate degree d
polynomials from noisy measurements, using O(d) samples and O~(d)
time, while increasing the error by a constant factor in L2.
Joint work with Xue Chen, Daniel Kane, and Zhao Song. Based on
https://arxiv.org/abs/1609.00896 and https://arxiv.org/abs/1609.01361 .
January 19, 2017
Gates 463A, 4:15 PM
David Karger (MIT)
A Fast and Simple Unbiased Estimator for Network (Un)reliability
The following procedure yields an unbiased estimator for the
disconnection probability of an n-vertex graph with minimum cut c if
every edge fails independently with probability p: (i) contract every edge independently with probability 1−n^{−2/c}, then (ii) recursively
compute the disconnection probability of the resulting tiny graph if
each edge fails with probability pn^{2/c}. We give a short, simple,
self-contained proof that this estimator can be computed in linear
time and has relative variance O(n^2). Combining these two facts with
a standard sparsification argument yields an O(n^3 log n)-time
algorithm for estimating the (un)reliability of a network. We also
show how the technique can be used to create unbiased samples of
disconnected networks.
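The unbiasedness has a one-line explanation: an edge fails overall with probability (n^{-2/c}) * (pn^{2/c}) = p, so the contract-then-fail process induces exactly the original failure distribution. The sketch below checks this on a toy triangle by Monte Carlo; it is illustrative only, since the actual algorithm evaluates the contracted graph's disconnection probability recursively rather than by sampling, and the instance and sample count are assumptions.

```python
import random

class DSU:
    """Union-find for connectivity checks."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def disconnect_sample(n, edges, p_contract, p_fail, rng):
    """One sample of the two-stage process: contract each edge with
    probability p_contract; each remaining edge then fails with
    probability p_fail. Returns 1 if the result is disconnected."""
    dsu = DSU(n)
    for u, v in edges:
        if rng.random() < p_contract:      # contracted: can never fail
            dsu.union(u, v)
        elif rng.random() >= p_fail:       # survived the failure stage
            dsu.union(u, v)
    roots = {dsu.find(v) for v in range(n)}
    return int(len(roots) > 1)

rng = random.Random(0)
n, c, p = 3, 2, 0.2                        # triangle: min cut c = 2
edges = [(0, 1), (1, 2), (2, 0)]
q = 1 - n ** (-2 / c)                      # contraction probability 2/3
trials = 50_000
est = sum(disconnect_sample(n, edges, q, p * n ** (2 / c), rng)
          for _ in range(trials)) / trials
# exact disconnection probability of the triangle at p = 0.2:
# 1 - (0.8**3 + 3 * 0.8**2 * 0.2) = 0.104
```

Note the recursion is only meaningful when pn^{2/c} <= 1, which is why p = 0.2 is used here for the triangle.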
January 12, 2017
Gates 463A, 4:15 PM
Vitaly Feldman (IBM Almaden)
Lower bounds against convex relaxations via the statistical
query complexity
In this talk I'll show how algorithms for convex optimization can be
used to derive lower bounds against general convex relaxations for
constraint satisfaction problems. This technique relies on several
recent advances in the understanding of statistical query* (SQ)
complexity:
1. A notion of SQ dimension of a problem that lower bounds the SQ complexity
2. Lower bounds on the SQ dimension of constraint satisfaction problems
3. SQ algorithms for stochastic convex optimization.
I'll overview these results and give some of their additional applications.
* Statistical query algorithms (Kearns, 1993) are algorithms that,
instead of drawing random samples from an input distribution D over a
domain X, have access to an SQ oracle for D. Given a real-valued query
function g over X, the oracle returns an estimate of the expectation
of g on a sample chosen randomly from D.
Based on joint works with C. Guzman, W. Perkins and S. Vempala.
December 8, 2016
Gates 463A, 4:15 PM
Yin Tat Lee (Microsoft Research Redmond)
Geodesic Walks
We introduce the geodesic walk for sampling Riemannian manifolds and apply it to the problem of generating uniform random points from polytopes in R^n specified by m inequalities. The walk is a discrete-time simulation of a stochastic differential equation (SDE) on the Riemannian manifold. The resulting sampling algorithm for polytopes mixes in O*(mn^{3/4}) steps. This is the first walk that breaks the quadratic barrier for mixing in high dimension, improving on the previous best bound of O*(mn) by Kannan and Narayanan for the Dikin walk. We also show that each step of the geodesic walk (solving an ODE) can be implemented efficiently, thus improving the time complexity for sampling polytopes.
December 1, 2016
Gates 463A, 4:15 PM
Ilias Diakonikolas (USC)
Computational Efficiency and Robust Statistics
We consider the following basic problem: Given corrupted
samples from a high-dimensional Gaussian, can we efficiently learn its
parameters? This is the prototypical question in robust statistics, a field that took shape in
the 1960s with the pioneering works of Tukey and Huber.
Unfortunately, all known robust estimators are hard to compute in high
dimensions. This prompts the following question: Can we reconcile
robustness and computational efficiency in high-dimensional learning?
In this work, we give the first efficient algorithms for robustly
learning a high-dimensional Gaussian that are able to tolerate a
constant fraction of corruptions. Our techniques also yield robust
estimators for several other high-dimensional models, including
Bayesian networks, and various mixture models.
The talk will be based on joint works with (different subsets of) G.
Kamath, D. Kane, J. Li, A. Moitra, and A. Stewart.
November 30, 2016
Gates 463A, 4:15 PM
Monika Henzinger (University of Vienna)
Local Flow Partitioning for Faster Edge Connectivity or Flow beats PageRank
We present a new deterministic algorithm for computing the unweighted minimum cut, a.k.a. edge connectivity, of an undirected graph with n nodes and m edges in time O(m log^2 n log log^2 n).
It improves the previous breakthrough result by Kawarabayashi and Thorup, which uses a PageRank-based subroutine and takes time O(m log^12 n). To achieve the running-time improvement we replace the PageRank-based computation by a flow-based computation.
Note that our algorithm is also faster than the fastest randomized algorithm for this problem, which is Karger's 1996 O(m log^3 n)-time algorithm.
This is joint work with Satish Rao and Di Wang and will appear at SODA 2017.
November 17, 2016
Gates 463A, 4:15 PM
Benjamin Moseley (Washington University, St. Louis)
Algorithmic Methods for Massively Parallel Data Science
This talk is concerned with designing algorithms for large scale data
science using massively parallel computation. The talk will discuss
theoretical models and algorithms for massively parallel frameworks
such as MapReduce and Spark. The constraints of the models are well
connected to practice, but pose several algorithmic challenges. This
talk introduces recent developments that overcome these challenges,
widely applicable massively parallel algorithmic techniques and key
questions on the theoretical foundations of massively parallel
computation. The methods introduced will be applied to large data
problems that are central to theoretical computer science and machine
learning: clustering and submodular function optimization.
The work in this talk has been supported by Google, Yahoo and the NSF.
November 10, 2016
Gates 463A, 4:15 PM
Sam Hopkins (Cornell)
A Nearly-Tight Sum-of-Squares Lower Bound for the Planted Clique Problem
We prove that with high probability over choice of a random
graph G from G(n,1/2), the n^O(d)-time degree-d sum-of-squares
semidefinite programming relaxation for the clique problem will give a
value of at least n^{1/2 - c(d/log(n))^(1/2) } for some c > 0,
yielding a nearly tight n^(1/2 - o(1)) bound on the value of this
program for any d = o(log n).
Moreover, we introduce a new framework that we call pseudocalibration
to construct sum-of-squares lower bounds. It yields a general recipe
for constructing good pseudo-distributions (i.e. dual certificates for
the sum-of-squares semidefinite program) and sheds further light on
the ways in which this hierarchy differs from others.
Joint work with Boaz Barak, Jon Kelner, Pravesh Kothari, Ankur Moitra,
and Aaron Potechin.
November 3, 2016
Gates 463A, 4:15 PM
Gil Cohen (Princeton)
Recent advances in randomness extractors and their applications
We survey recent developments in randomness extractors and their applications to classical problems such as Ramsey graphs constructions and privacy amplification protocols. This exciting progress heavily relies on two new pseudo-random primitives we call correlation breakers and independence-preserving mergers, which we discuss.
October 27, 2016
Gates 415, 4:15 PM
Vasilis Syrgkanis (Microsoft Research, New England)
Computationally Efficient Learning in Auctions
A line of recent work provides welfare guarantees for simple
combinatorial auction formats, such as selling m items via
simultaneous second price auctions. These guarantees hold even when
the auctions are repeatedly executed and players use no-regret
learning algorithms. Unfortunately, off-the-shelf no-regret algorithms
for these auctions are computationally inefficient as the number of
actions is exponential. We show that this obstacle is insurmountable:
there are no polynomial-time no-regret algorithms for simultaneous
second-price auctions, even when the bidders are unit-demand. Our
lower bound raises the question of how good outcomes
polynomially-bounded bidders may discover in such auctions. To answer
this question, we propose a novel concept of learning in auctions,
termed "no-envy learning." This notion is founded upon Walrasian
equilibrium, and we show that it is both efficiently implementable and
results in approximately optimal welfare, even when the bidders have
fractionally subadditive (XOS) valuations (assuming demand oracles) or
coverage valuations (without demand oracles). Our results for XOS
valuations are enabled by a novel Follow-The-Perturbed-Leader
algorithm for settings where the number of experts is infinite, and
the payoff function of the learner is non-linear. Our result for
coverage valuations is based on a novel use of convex rounding schemes
and a reduction to online convex optimization.
Joint work with Constantinos Daskalakis
Link to paper: https://arxiv.org/abs/1511.01411
October 20, 2016
Gates 415, 4:15 PM
Nir Ailon (Technion-Israel Institute of Technology)
Fourier Transform Computational Lower Bounds
The Fourier transform is one of the most important linear operators in engineering. Computing it for a given input vector x in n-space can be done in time Theta(n log n) using Cooley and Tukey's FFT (Fast Fourier Transform). A lower bound better than the trivial \Omega(n) has been elusive. In this talk I will show the surprising connection between speeding up FFT and uncertainty: If you could (theoretically) speed up FFT without changing the word size on the machine, then you would create uncertainty in the data in the form of numerical overflow or underflow. Alas, increasing the machine word size costs more time per multiplication/addition operation.
I will explain and prove this principle, and present two very recent related bounds together with a main conjecture and some related open problems.
The talk requires standard linear algebra and very basic familiarity with Shannon entropy.
October 13, 2016
Gates 415, 4:15 PM
Mary Wootters (Stanford)
Repairing Reed-Solomon Codes
Reed-Solomon (RS) codes are often used in distributed storage. However, recently it's been observed that the traditional recovery algorithms for RS codes are substantially sub-optimal in this setting, and the quickly-developing field of regenerating codes has provided some much better solutions.
In this work, we show that, in fact, RS codes are much better for distributed storage than you might think! Our main result is that, in some parameter regimes, RS codes themselves are optimal regenerating codes, among MDS codes with linear repair schemes. Moreover, we give a characterization of MDS codes with good linear repair schemes which holds in any parameter regime, and which can be used to give non-trivial repair schemes for RS codes in other settings.
Along the way, we'll develop a surprising fact about low-degree polynomials (at least, surprising to me). A fundamental fact about polynomial interpolation is that k evaluations determine a degree k-1 polynomial. This is necessary in a strong sense: if we have only k-1 evaluations, we don't learn anything at all about the evaluation on a k'th point. However, our repair scheme for RS codes shows that if we are allowed to consider partial evaluations (in the sense that we only get a few bits of information about the evaluation), then in fact we can do better (in the sense that we can use fewer bits total to determine an unknown evaluation).
This is joint work with Venkat Guruswami.
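The interpolation fact above can be checked by brute force over a small field. In this sketch (illustration only; GF(13) and the particular points are chosen just for size), the 13 polynomials of degree at most 3 that agree with 3 fixed evaluations take all 13 possible values at a 4th point, so those 3 full evaluations reveal nothing about it:

```python
from itertools import product

P = 13                                  # work over GF(13)
pts = [1, 2, 3]                         # three known evaluation points
target = 5                              # the point we try to learn about
known = {1: 4, 2: 7, 3: 2}              # arbitrary fixed evaluations

def evaluate(coeffs, x):
    """Evaluate a polynomial (low-order coefficients first) mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

# enumerate all degree-<=3 polynomials consistent with the 3 evaluations
consistent = [c for c in product(range(P), repeat=4)
              if all(evaluate(c, x) == known[x] for x in pts)]
values_at_target = {evaluate(c, target) for c in consistent}
```

Every residue mod 13 appears exactly once in `values_at_target`, since the consistent polynomials form a coset f + t * (x-1)(x-2)(x-3). The repair schemes in the talk get around this by collecting partial, few-bit evaluations from many points instead of full evaluations from few.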
October 6, 2016
Gates 463A, 4:15 PM
Aaron Sidford (Stanford)
Faster Algorithms for Computing the Stationary Distribution,
Simulating Random Walks, and More
In this talk, I will present the first algorithms for
solving a wide range of fundamental problems on directed graphs in
less time than is required to solve a general system of linear
equations. In particular I will discuss computing the stationary
distribution of a directed random walk, computing Personalized
PageRank with a polylogarithmic dependence on error and restart
probability, and computing many of the basic properties of a
non-reversible Markov chain, including the hitting times, escape
probabilities, and all-pairs round-trip commute times.
To achieve these improved running times I will provide an improved
algorithm for computing solutions to directed Laplacian systems, a
natural generalization of symmetric or undirected Laplacian systems.
In particular, I will discuss how to solve a directed Laplacian
system associated with a directed graph with "m" edges and "n"
vertices in time \tilde{O}(m^{3/4}n+mn^{2/3}), where the \tilde{O}
notation suppresses polylogarithmic factors in n, m, the accuracy
desired, and the appropriate condition number of the problem. For
sparse graphs with polynomial condition numbers, this gives a running
time of \tilde{O}(n^{7/4}), which breaks the O(n^{2}) barrier that
would be the best that one could hope to achieve using fast matrix
multiplication. Along the way, I will introduce some general
techniques for reasoning about directed Laplacian systems that may be
of independent interest.
This talk reflects joint work with Michael B. Cohen, Jonathan Kelner,
John Peebles, Richard Peng, and Adrian Vladu.
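As context for the quantities in the first paragraph, the stationary distribution of a small directed chain can be computed by plain power iteration. This is a naive baseline sketch (the chain below is an assumption, chosen so the answer is uniform by symmetry); the talk's point is beating this kind of generic linear-algebra approach at scale via directed Laplacian solvers.

```python
def stationary(P, iters=200):
    """Power iteration for the stationary distribution of a
    row-stochastic transition matrix P (assumed ergodic)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# directed 3-cycle with self-loops: a non-reversible chain whose
# stationary distribution is uniform by symmetry
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5]]
pi = stationary(P)
```

Each power-iteration step costs a matrix-vector product, and the number of steps needed grows with the chain's mixing time; the solvers in the talk avoid that dependence.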