Stanford Theory Seminar


Stanford Theory Seminar (formerly known as Stanford Algorithms Seminar/AFLB) is usually held in Gates 463A.
Faculty Contact: Mary Wootters
Student Contact: Ofir Geri
Click here for directions.

Friday, December 7

Gates 392, 3:00PM

Sam Hopkins (Berkeley)

Mean Estimation with Sub-Gaussian Rates in Polynomial Time

We study polynomial-time algorithms for a fundamental statistics problem: estimating the mean of a random vector from i.i.d. samples. Focusing on the heavy-tailed case, we assume only that the random vector X has finite mean and covariance. In this setting, the radius of confidence intervals achieved by the empirical mean are large compared to the case that X is Gaussian or sub-Gaussian. On the other hand, estimators based on high-dimensional medians can achieve tighter confidence intervals, at the cost of potential computational intractability.

We offer the first polynomial time algorithm to estimate the mean with sub-Gaussian-size confidence intervals under such mild assumptions. Our algorithm is based on a new semidefinite programming relaxation of a high-dimensional median. Previous estimators which assumed only existence of finitely-many moments of X either sacrifice sub-Gaussian performance or are only known to be computable via brute-force search procedures requiring time exponential in the dimension.


Wednesday, December 5

Gates 463A, 3:00PM Note the ununsual day!

Saeed Seddighin (University of Maryland)

Fast and Parallel Algorithms for Edit Distance and Longest Common Subsequence

String similarity measures are among the most fundamental problems in computer science. The notable examples are edit distance (ED) and longest common subsequence (LCS). These problems find their applications in various contexts such as computational biology, text processing, compiler optimization, data analysis, image analysis, etc. In this talk, I'll present fast and parallel algorithms for both problems. In the first part of my talk, I will present an algorithm for approximating edit distance within a constant factor in truly subquadratic time. This question has been open for 3 decades and only recently we were able to give positive answers to it.

In the second part of my talk, I will present MPC algorithms for both edit distance and longest common subsequence. These algorithms can be seen as extensions of the previous ideas to the MPC model. The algorithms are optimal with respect to round complexity, time complexity, and approximation factor.


Wednesday, November 28

Gates 463A, 3:00PM Note the ununsual day!

Urmila Mahadev (Berkeley)

Classical Verification of Quantum Computations

We present the first protocol allowing a classical computer to interactively verify the result of an efficient quantum computation. We achieve this by constructing a measurement protocol, which allows a classical string to serve as a commitment to a quantum state. The protocol forces the prover to behave as follows: the prover must construct an n qubit state of his choice, non-adaptively measure each qubit in the Hadamard or standard basis as directed by the verifier, and report the measurement results to the verifier. The soundness of this protocol is enforced based on the assumption that the learning with errors problem is computationally intractable for efficient quantum machines.


Friday, November 16

Gates 392, 3:15PM Note the different room and the slightly different time!

Jonathan Mosheiff (Weizmann)

On the weight distribution of random binary linear codes

A random (binary) linear code is a dimension lambda*n (0 < \lambda < 1) linear subspace of the binary n-dimensional hypercube, chosen uniformly from among all such subspaces. Such codes play an important role in the theory of error correcting codes, since they achieve the best known rate vs. distance trade-off, i.e., the Gilbert-Varshamov lower bound. Under a random errors regime, the problem of decoding these codes is known as Learning Parity with Noise, and has many cryptographic applications. This work is motivated by the contrast between the importance of random linear codes and how little we know about them.

Much of the interesting information about a code C is captured by its weight distribution. This is the vector (w_0,w_1,...,w_n) where w_i counts the elements of C with Hamming weight i. In this work we study the weight distribution of random linear codes. In our main result we compute the moments of the random variable w_(gamma*n), where 0 < \gamma < 1 is a fixed constant and n goes to infinity.

This is joint work with Nati Linial.


Friday, November 16

Gates 104, 2:00PM Note the different room and time!

Andrew Chi-Chih Yao, Tsinghua University

Some Perspectives on FinTech

FinTech can be seen as the meeting of minds between economics and computer science in the digital age. One of its major intellectual foundations is the interdisciplinary subject of multiparty computation, which involves reliable distributed computing and cryptography from the side of computer science, and efficient mechanism design for financial activities from the side of economics. In this talk we discuss some recent work in auction and blockchain from this perspective. For example, is it true that more revenue can always be extracted from an auction where the bidders are more willing to pay than otherwise? Can more revenue be extracted when the bidders are more risk-tolerant than otherwise? We also present some new results on blockchain fees. These results help shed light on some structural questions in economics whose answers are non-obvious.


Friday, November 9

No theory seminar today, instead...

TOCA-SV and the Motwani Colloquium! All day long! Huang Mackenzie center, room 300!

The organizing committee: Moses Charikar (Stanford), Ravi Kumar (Google), Omer Reingold (Stanford), D. Sivakumar (Google) and Gregory Valiant (Stanford) Abstracts:

Avishay Tal, Oracle Separation of BQP and the Polynomial Hierarchy

In their seminal paper, Bennett, Bernstein, Brassard and Vazirani [SICOMP, 1997] showed that relative to an oracle, quantum algorithms are unable to solve NP-complete problems in sub-exponential time (i.e., that Grover's search is optimal in this setting).

In this work, we show a strong converse to their result. Namely, we show that, relative to an oracle, there exist computational tasks that can be solved efficiently by a quantum algorithm, but require exponential time for any algorithm in the polynomial hierarchy (a hierarchy of complexity classes that captures P, NP, coNP, and their generalizations).

The tasks that exhibit this "quantum advantage" arise from a pseudo-randomness approach initiated by Aaronson [STOC, 2010]. Our core technical result is constructing a distribution over Boolean strings that "look random" to constant-depth circuits of quasi-polynomial size, but can be distinguished from the uniform distribution by very efficient quantum algorithms.

Joint work with Ran Raz

Badih Ghazi, Resource-Efficient Common Randomness and Secret Key Generation

The task of manipulating randomness has been a subject of intense investigation in computational complexity with dispersers, extractors, pseudorandom generators, condensers, mergers being just a few of the objects of interest. All these tasks consider a single processor massaging random samples from an unknown source.

In this talk, I will discuss a less studied setting where randomness is distributed among different players who would like to convert it to other forms in an efficient manner and with little communication. For instance players may be given access to a source of biased correlated bits and their goal may be to get a common random string out of this source. Even the setting where the source is known leads to some interesting questions that have been explored since the 70s with striking constructions and some surprisingly hard questions. After giving some background, I will describe recent work which explores the task of extracting common randomness from correlated sources with bounds on either the sample complexity or on the number of rounds of interaction.

Based on joint works with T.S. Jayram, Mitali Bafna, Noah Golowich and Madhu Sudan

Shafi Goldwasser, Pseudo Deterministic Algorithms and Proofs

Probabilistic algorithms for both decision and search problems can offer significant complexity improvements over deterministic algorithms. One major difference, however, is that they may output different solutions for different choices of randomness. This makes correctness amplification impossible for search algorithms and is less than desirable in setting where uniqueness of output is important such as generation of system wide cryptographic parameters or distributed setting where different sources of randomness are used. Pseudo-deterministic algorithms are a class of randomized search algorithms, which output a unique answer with high probability. Intuitively, they are indistinguishable from deterministic algorithms by an polynomial time observer of their input/output behavior. In this talk I will describe what is known about pseudo-deterministic algorithms in the sequential, sub-linear and parallel setting. We will also describe n extension of pseudo-deterministic algorithms to interactive proofs for search problems where the veri fier is guaranteed with high probability to output the same output on different executions, regardless of the prover strategies.


Friday, November 2

Gates 392, 3:00PM Note the different room!

Rotem Oshman (Tel-Aviv University)

Two Open Problems in Distributed Graph Algorithms

In a distributed graph algorithm, we have a network of computing nodes, where each node initially knows only its own local neighborhood; the nodes communicate over the network edges in order to solve some problem on the network graph. We are interested in algorithms that are fast, but also do not require a lot of communication between the network nodes.

In this talk I will describe recent algorithms and lower bounds for two graph problems: exact maximum bipartite matching, and testing whether the network contains an even-length cycle of a specific length. Both problems do not have matching upper and lower bounds, so their complexity remains open. The talk will not assume any prior knowledge about distributed computing.


Wednesday, October 31

Gates 463A, 3:00PM NOTE THE UNUSUAL DAY!

Toniann Pitassi (University of Toronto and Institute for Advanced Study)

Lifting in Communication Complexity and Beyond

Hardness escalation or query-to-communication lifting is a method of proving tight upper and lower bounds on the complexity of a composed function or relation by a reduction to the query complexity of the base function. This talk will primarily be a tutorial. We will give many applications of lifting, including new and improved circuit lower bounds, as well as lower bounds in game theory, proof complexity and extended formulation complexity. We will sketch the main ideas in the proofs of lifting for randomized communication complexity, and conclude with new directions. This is joint work with Mika Goos and Thomas Watson.


Friday, October 26

Gates 463A, 3:00PM

Erik Waingarten (Columbia)

Approximate Nearest Neighbors via Non-Linear Spectral Gaps

I will present recent advances in approximate nearest neighbor search data structures for general normed spaces. I will explain what non-linear spectral gaps are, and how to use estimates on non-linear spectral gaps to partition large graphs whose vertices lie in a normed space. As a result, I will present the first sub-linear time data structure for approximate nearest neighbor search in high-dimensional normed spaces with approximation which is sub-polynomial in the dimension.

Based on joint work with Alex Andoni, Assaf Naor, Sasho Nikolov, and Ilya Razenshteyn.


Tuesday, October 23

Gates 463A, 4:15PM Note the unusual date and time!!

Srikanth Srinivasan (IIT Bombay)

On constant-depth circuits for the coin problem

We come up with explicit functions for which we can prove tight (up to polynomial factors) upper and lower bounds in the AC^0[2] circuit model. In particular, this implies the first Fixed-Depth Size Hierarchy theorem for this model.

The explicit functions are obtained by constructing explicit AC^0[2] circuits for solving the coin problem, which is defined as follows. For a parameter delta, we have to decide whether a given coin has bias (1+delta)/2 or (1-delta)/2.

Our upper bounds are proved by derandomizing a circuit construction of O'Donnell-Wimmer (2007) and Amano (2009) to reduce the number of samples. Our lower bounds follow from a modification of the Razborov-Smolensky polynomial method.

Joint work with Nutan Limaye (IITB CSE), Karteek Sreenivasiah (IITH CSE), Utkarsh Tripathi (IITB Math) and S Venkitesh (IITB Math).


TUESDAY, October 16

Gates 463A, 3:00PM Note the unusual day!!

Elad Hazan (Princeton)

Taking Control by Convex Optimization

Linear dynamical systems, a.k.a. Kalman filtering, are a class of time-series models widely used in robotics, finance, engineering, and meteorology. In it's general form (unknown system), learning LDS is a classic non-convex problem, typically tackled with heuristics like gradient descent ("backpropagation through time") or the EM algorithm.

I will present our new "spectral filtering" approach to the identification and control of discrete-time general linear dynamical systems with multi-dimensional inputs, outputs, and a latent state. This approach yields a simple and efficient algorithms for low-regret prediction (i.e. asymptotically vanishing MSE) as well as finite-time control.


THURSDAY, October 11

Gates 463A, 4:15PM Note the unusual day and time!!

Juba Ziani (Caltech)

Optimal Data Acquisition for Statistical Estimation

We consider a data analyst's problem of purchasing data from strategic agents to compute an unbiased estimate of a statistic of interest. Agents incur private costs to reveal their data and the costs can be {\em arbitrarily correlated} with their data. Once revealed, data are verifiable. This paper focuses on {\em linear} unbiased estimators. We design an individually rational and incentive compatible mechanism that optimizes the worst-case mean-squared error of the estimation, where the worst-case is over the unknown correlation between costs and data, subject to a budget constraint in expectation. We characterize the form of the optimal mechanism in closed-form. We further extend our results to acquiring data for estimating a parameter in regression analysis, where private costs can correlate with the values of the dependent variable but not with the values of the independent variables.


MONDAY, October 8

Gates 463A, 3:00PM Note the unusual day!!

Shahar Dobzinski (Weizmann)

From Cognitive Biases to the Communication Complexity of Local Search

In this talk I will tell you how analyzing economic markets where agents have cognitive biases has lead to better understanding of the communication complexity of local search procedures.

We begin the talk with studying combinatorial auctions with bidders that exhibit endowment effect. In most of the previous work on cognitive biases in algorithmic game theory (e.g., [Kleinberg and Oren, EC'14] and its follow-ups) the focus was on analyzing the implications and mitigating their negative consequences. In contrast, we show how cognitive biases can sometimes be harnessed to improve the outcome.

Specifically, we study Walrasian equilibria in combinatorial markets. It is well known that a Walrasian equilibrium exists only in limited settings, e.g., when all valuations are gross substitutes, but fails to exist in more general settings, e.g., when the valuations are submodular. We consider combinatorial settings in which bidders exhibit the endowment effect, that is, their value for items increases with ownership. Our main result here shows that when the valuations are submodular even a mild level of endowment effect suffices to guarantee the existence of Walrasian equilibrium. In fact, we show that in contrast to Walrsian equilibria with standard utility maximizers bidders -- in which the equilibrium allocation must be a global optimum -- when bidders exhibit endowment effect any local optimum can be an equilibrium allocation.

This raises the natural question of understanding the complexity of computing a local maximum in combinatorial markets. We reduce it to the following communication variant of local search: there is some fixed, commonly known graph G. Alice holds f_A and Bob holds f_B, both are functions that specify a value for each vertex. The goal is to find a local maximum of f_A+f_B, i.e., a vertex v for which f_A(v)+f_B(v) >= f_A(u)+f_B(u) for every neighbor u of v. We prove that finding a local maximum requires polynomial (in the number of vertices) bits of communication.

Based on joint works with Moshe Babaioff, Yakov Babichenko, Noam Nisan, and Sigal Oren.


Friday, October 5

Gates 463A, 3:00PM

Madhu Sudan (Harvard)

General Strong Polarization

A martingale is a sequence of random variables that maintain their future expected value conditioned on the past. A $[0,1]$-bounded martingale is said to polarize if it converges in the limit to either $0$ or $1$ with probability $1$. A martingale is said to polarize strongly, if in $t$ steps it is sub-exponentially close to its limit with all but exponentially small probability. In 2008, Arikan built a powerful class of error-correcting codes called Polar codes. The essence of his theory associates a martingale with every invertible square matrix over a field (and a channel) and showed that polarization of the martingale leads to a construction of codes that converge to Shannon capacity. In 2013, Guruswami and Xia, and independently Hassani et al. showed that strong polarization of the Arikan martingale leads to codes that converge to Shannon capacity at finite block lengths, specifically at lengths that are inverse polynomial in the gap to capacity, thereby resolving a major mathematical challenge associated with the attainment of Shannon capacity.

We show that a simple necessary condition for an invertible matrix to polarize over any non-trivial channel is also sufficient for strong polarization over all symmetric channels over all prime fields. Previously the only matrix which was known to polarize strongly was the $2\times 2$ Hadamard matrix. In addition to the generality of our result, it also leads to arguably simpler proofs. The essence of our proof is a ``local definition'' of polarization which only restricts the evolution of the martingale in a single step, and a general theorem showing the local polarization suffices for strong polarization.

In this talk I will introduce polarization and polar codes and, time permitting, present a full proof of our main theorem. No prior background on polar codes will be assumed.

Based on joint work with Jaroslaw Blasiok, Venkatesan Guruswami, Preetum Nakkiran and Atri Rudra.


Friday, September 28

Gates 463A, 3:00PM

Barna Saha (UMass)

Language Edit Distance, (min,+)-Matrix Multiplication & Beyond

The language edit distance is a significant generalization of two basic problems in computer science: parsing and string edit distance computation. Given any context free grammar, it computes the minimum number of insertions, deletions and substitutions required to convert a given input string into a valid member of the language. In 1972, Aho and Peterson gave a dynamic programming algorithm that solves this problem in time cubic in the string length. Despite its vast number of applications, in forty years there has been no improvement over this running time.

Computing (min,+)-product of two n by n matrices in truly subcubic time is an outstanding open question, as it is equivalent to the famous All-Pairs-Shortest-Paths problem. Even when matrices have entries bounded in [1,n], obtaining a truly subcubic (min,+)-product algorithm will be a major breakthrough in computer science.

In this presentation, I will explore the connection between these two problems which led us to develop the first truly subcubic algorithms for the following problems with improvements coming for each of these problems after several decades: (1) language edit distance, (2) RNA-folding-a basic computational biology problem and a special case of language edit distance computation, (3) stochastic grammar parsing—fundamental to natural language processing, and (4) (min,+)-product of integer matrices with entries bounded in n^(3-ω-c) where c >0 is any constant and, ω is the exponent of the fast matrix multiplication, believed to be 2.