In 2008, Zeev Dvir achieved a breakthrough in combinatorial geometry by giving a stunningly simple, and sharp, bound on the size of "Kakeya Sets" in F_q^n, the n-dimensional vector space over the finite field with q elements. (A Kakeya set in any vector space is a set that contains a line in every direction.) Dvir proved this bound by setting up an n-variate low-degree non-zero polynomial that vanished on every point in the set, and then using the algebraic nature of a Kakeya set to argue that this polynomial was zero too often if the set was too small. In addition to resolving a long-studied problem in combinatorial geometry, this method also led to new constructions of ``randomness extractors''.
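For concreteness, here is the standard counting-and-vanishing sketch behind this argument (a reconstruction for the reader, not quoted from the talk):

    If $|K| < \binom{n+q-1}{n}$, then the linear system asking an $n$-variate polynomial of degree at most $q-1$ to vanish on $K$ has more unknown coefficients than equations, so some nonzero $P$ of degree $d \le q-1$ vanishes on all of $K$. For every direction $b$, the set $K$ contains a line $\{a+tb : t \in F_q\}$, and the univariate polynomial $t \mapsto P(a+tb)$ has degree at most $q-1$ yet $q$ roots, so it is identically zero; in particular its coefficient of $t^d$, which equals $P_d(b)$ for the top homogeneous part $P_d$ of $P$, vanishes. Thus $P_d$ is a nonzero polynomial of degree at most $q-1$ vanishing on all of $F_q^n$, contradicting the Schwartz--Zippel bound. Hence $|K| \ge \binom{n+q-1}{n} \ge q^n/n!$.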
In this talk I will describe algebraic methods to improve the analysis of Dvir, by using polynomials that vanish with ``high multiplicity'' at every point of the given set. This method, based on prior work with Guruswami (1998), ends up yielding extremely tight (to within a factor of 2) bounds on the size of Kakeya sets; and, in combination with a host of other techniques, state-of-the-art ``extractors'' (algorithms that purify randomness).
In this talk I will describe the (simple) idea behind the method of multiplicities and some of the applications.
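For reference, the counting fact that powers the method of multiplicities (a standard observation, included here for the reader rather than taken from the abstract):

    An $n$-variate polynomial of degree at most $d$ has $\binom{d+n}{n}$ coefficients, while requiring it to vanish with multiplicity $m$ at a single point (all Hasse derivatives of order less than $m$ equal to zero there) imposes only $\binom{m+n-1}{n}$ homogeneous linear conditions. Hence a nonzero polynomial of degree at most $d$ vanishing with multiplicity $m$ at every point of a set $K$ exists whenever $|K|\binom{m+n-1}{n} < \binom{d+n}{n}$; this is the interpolation step that replaces the plain vanishing condition in Dvir's argument.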
Based on joint works with Shubhangi Saraf (Analysis & PDE, 2009); and with Zeev Dvir, Swastik Kopparty, and Shubhangi Saraf (FOCS 2009).
The parallel repetition theorem states that for any two-prover game with value smaller than 1, parallel repetition reduces the value of the game at an exponential rate.
I will give a short introduction to the problem of parallel repetition of two-prover games and some of its applications in theoretical computer science, mathematics and physics. I will concentrate mainly on recent results.
In a two-prover (alternatively, two-player) game, a referee chooses questions $(x,y)$ according to a (publicly known) distribution, and sends $x$ to the first player and $y$ to the second player. The first player responds by $a=a(x)$ and the second by $b=b(y)$ (without communicating with each other). The players jointly win if a (publicly known) predicate $V(x,y,a,b)$ holds. The value of the game is the maximal probability of success that the players can achieve, where the maximum is taken over all protocols $a=a(x),b=b(y)$.
A parallel repetition of a two-prover game is a game where the players try to win $n$ copies of the original game simultaneously. More precisely, the referee generates questions $x=(x_1,...,x_n), y=(y_1,...,y_n)$, where each pair $(x_i,y_i)$ is chosen independently according to the original distribution. The players respond by $a=(a_1,...,a_n)$ and $b=(b_1,...,b_n)$. The players win if they win simultaneously on all the coordinates, that is, if for every $i$, $V(x_i,y_i,a_i,b_i)$ holds.
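To make these definitions concrete, here is a small brute-force illustration (my own sketch, not part of the abstract; the XOR game used in the example is only an assumed placeholder, and for simplicity both players draw questions from the same set):

    import itertools

    def game_value(questions, answers, dist, V):
        # value of a two-prover game: best winning probability over all
        # deterministic strategies a = a(x), b = b(y); dist maps (x, y) -> prob.
        best = 0.0
        for sa in itertools.product(answers, repeat=len(questions)):
            for sb in itertools.product(answers, repeat=len(questions)):
                a = dict(zip(questions, sa))
                b = dict(zip(questions, sb))
                p = sum(pr for (x, y), pr in dist.items() if V(x, y, a[x], b[y]))
                best = max(best, p)
        return best

    def repeated_game(questions, answers, dist, V, n):
        # n-fold parallel repetition: questions and answers become n-tuples,
        # question coordinates are drawn independently, and the players must
        # win in every coordinate simultaneously.
        qn = list(itertools.product(questions, repeat=n))
        an = list(itertools.product(answers, repeat=n))
        distn = {}
        for xs in qn:
            for ys in qn:
                p = 1.0
                for x, y in zip(xs, ys):
                    p *= dist.get((x, y), 0.0)
                if p > 0:
                    distn[(xs, ys)] = p
        Vn = lambda xs, ys, aa, bb: all(V(x, y, a, b) for x, y, a, b in zip(xs, ys, aa, bb))
        return qn, an, distn, Vn

    # Example (assumed): the XOR game with uniform questions x, y in {0, 1},
    # answers a, b in {0, 1}, and winning predicate a XOR b = x AND y.
    Q, A = [0, 1], [0, 1]
    dist = {(x, y): 0.25 for x in Q for y in Q}
    V = lambda x, y, a, b: (a ^ b) == (x & y)
    print(game_value(Q, A, dist, V))                       # value of the single game
    print(game_value(*repeated_game(Q, A, dist, V, 2)))    # value of the 2-fold repetition

Brute force quickly becomes infeasible as n grows, which is exactly why one wants a general theorem about how the value decays.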
The parallel repetition theorem states that for any two-prover game with value $1-\epsilon$ (for, say, $\epsilon < 1/2$), the value of the game repeated in parallel $n$ times is at most $(1-\epsilon^3)^{\Omega(n/s)}$, where $s$ is the length of the answers in the original game.
We present a near-linear time algorithm that approximates the edit distance between two strings within a significantly better factor than previously known. This result emerges in our investigation of edit distance from a new perspective, namely a model of asymmetric queries, for which we present near-tight bounds.
Another consequence of this new model is the first rigorous separation between edit distance and Ulam distance, by tracing the hardness of edit distance to phenomena that were not used by previous analyses.
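For reference, and not part of the abstract: exact edit distance is computed by the classical quadratic-time dynamic program below, which is the baseline that near-linear-time approximation algorithms such as this one aim to beat.

    def edit_distance(s, t):
        # classical O(|s|*|t|) dynamic program, kept as a single rolling row:
        # dp[j] holds the edit distance between the current prefix of s and t[:j]
        dp = list(range(len(t) + 1))
        for i, cs in enumerate(s, 1):
            prev_diag, dp[0] = dp[0], i
            for j, ct in enumerate(t, 1):
                prev_diag, dp[j] = dp[j], min(
                    dp[j] + 1,               # delete cs
                    dp[j - 1] + 1,           # insert ct
                    prev_diag + (cs != ct),  # substitute, or free match
                )
        return dp[len(t)]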
[Joint work with Alexandr Andoni and Krzysztof Onak.]
The "Coin Problem" is the following problem: a biased coin is given. The coin lands on heads with probability either 1/2+beta or 1/2-beta. We are given the outcome of n independent tosses of this coin, and the goal is to guess which way the coin is biased, and to answer correctly with probability >= 2/3. When our computational model is unrestricted, the majority function is optimal, and succeeds when beta >= c / sqrt(n) for a large enough constant c. The coin problem is interesting in models that cannot compute the majority function, such as AC0, low-degree polynomials, automata, and low-width read-once branching programs.
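As a concrete illustration of the unrestricted-model baseline (my own sketch, not part of the talk; the constant 3 below is an arbitrary stand-in for the constant c):

    import random

    def majority_success_rate(n, beta, trials=10000, seed=0):
        # Monte Carlo estimate of how often a majority vote over n tosses
        # identifies the direction of the bias of a coin with heads
        # probability 1/2 + beta or 1/2 - beta.
        rng = random.Random(seed)
        correct = 0
        for _ in range(trials):
            biased_towards_heads = rng.random() < 0.5
            p = 0.5 + beta if biased_towards_heads else 0.5 - beta
            heads = sum(rng.random() < p for _ in range(n))
            guess = heads > n / 2            # majority vote (a tie guesses tails)
            correct += (guess == biased_towards_heads)
        return correct / trials

    # e.g. majority_success_rate(10000, 3 / 10000 ** 0.5) is well above 2/3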
In this talk I will show recent results with Joshua Brody, on the coin problem in the model of width-w read-once branching programs. We prove that if a branching program solves the coin problem with bias beta, then beta must be at least 1/ (log n)^{Theta(w)}. This is nearly tight, as can be seen by considering the recursive tribes function. We also briefly discuss various generalizations and variants of this result. The proof is based on appropriate use of techniques from work on AC0 circuits (mainly random restriction), along with some new ideas.
We suggest one application for this kind of theorem: We show that Nisan's generator epsilon-fools width-w read-once *regular* branching programs, using seed length O(log n * log log n) when epsilon and w are both constant.
We are looking for applications of the coin problem in other domains (e.g. streaming lower bounds).
This talk is based on work with Joshua Brody, presented in FOCS 2010 (i.e. more or less right now). The paper is available at: http://www.cs.dartmouth.edu/~jbrody/papers/TheCoinProblemFOCS.pdf
We close the problem of understanding the space complexity of pth moment estimation in data streams for 0 < p <= 2 by giving the first optimal upper and lower bounds. In this problem, a high-dimensional vector receives a long sequence of coordinate-wise updates, and we must output a low relative-error approximation of the pth moment at the end of the stream while keeping a memory footprint exponentially smaller than the vector length. This problem has applications in network traffic monitoring, databases, and simply computing distance measures across massive datasets.
We obtain the upper bound by showing an invariance principle for sums of boundedly independent p-stable random variables, which may be of independent interest. Our main ingredient in this proof is the introduction of an approximation theoretic tool we dub "FT-mollification", which has since found other applications in agnostic learning, pseudorandomness, and approximation theory. We obtain the lower bound by showing a direct sum property for the one-way communication complexity of the GapHamming problem.
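For readers unfamiliar with the setting, here is a minimal sketch of the classical fully independent p-stable sketch for p = 1 (Cauchy variables and a median estimator); showing that bounded independence suffices for this kind of estimator is what the upper bound is about, but the code itself is an assumed illustration, not the paper's algorithm:

    import math
    import random
    import statistics

    def make_sketch(dim, k, seed=0):
        # k = O(1/eps^2) counters; A[j][i] are standard Cauchy (1-stable) variables.
        # Here they are fully independent and stored explicitly; a true streaming
        # algorithm generates them from a short (pseudo)random seed instead.
        rng = random.Random(seed)
        A = [[math.tan(math.pi * (rng.random() - 0.5)) for _ in range(dim)] for _ in range(k)]
        return A, [0.0] * k

    def update(A, counters, i, delta):
        # coordinate i of the underlying vector receives an additive update delta
        for j in range(len(counters)):
            counters[j] += A[j][i] * delta

    def estimate_l1(counters):
        # each |counter_j| is ||x||_1 times the absolute value of a standard Cauchy,
        # whose median is 1, so the median of the |counter_j| estimates ||x||_1
        return statistics.median(abs(c) for c in counters)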
Joint work with Daniel M. Kane (Harvard) and David P. Woodruff (IBM Almaden).
The Bayesian assumption is the traditional way to model uncertainty in settings of incomplete information. This assumption, however, is very strong, and may not hold in many real applications. We thus put forward (1) a set-theoretic way to model the knowledge that a player might have about his opponents, together with (2) a new class of mechanisms (not equilibrium-based) capable of leveraging such more conservative knowledge in a robust way.
In auctions of a single good, we show that one such mechanism can perfectly guarantee a revenue benchmark (always in between the second highest and the highest valuation) that no classical mechanism can even approximate in any robust way.
Joint Work with Jing Chen
We were led to this result through our study of methods to reduce the computational complexity of cryptographic primitives and protocols. Current constructions of cryptographic primitives typically involve a large multiplicative computational overhead that grows with the desired level of security. We explore the possibility of implementing cryptographic primitives (such as encryption, authentication, signatures, or secure two-party computation) while incurring only a *constant* computational overhead compared to insecure implementations of the same tasks.
This is joint work with Yuval Ishai, Eyal Kushilevitz, and Rafail Ostrovsky.
The parallel repetition theorem states that for any Two Prover Game with value at most 1-\eps (for \eps<1/2), the value of the game repeated n times in parallel is at most (1-\eps^3)^{\Omega(n/s)}, where s is the length of the answers of the two provers \cite{R,Hol}. For Projection Games, the bound on the value of the game repeated n times in parallel was improved to (1-\eps^2)^{\Omega(n)} \cite{Rao} and this bound was shown to be tight \cite{R08}.
In this paper we study the case where the underlying distribution, according to which the questions for the two provers are generated, is uniform over the edges of a (bipartite) expander graph.
We show that if \lambda is the (normalized) spectral gap of the underlying graph, the value of the repeated game is at most (1-\eps^2)^{\Omega(c(\lambda) \cdot n/ s)}, where c(\lambda) = \poly(\lambda); and if in addition the game is a projection game, we obtain a bound of (1-\eps)^{\Omega(c(\lambda) \cdot n)}, where c(\lambda) = \poly(\lambda), that is, a strong parallel repetition theorem (when \lambda is constant).
This gives a strong parallel repetition theorem for a large class of two prover games.
This is joint work with Ran Raz.
Global rigidity of nice cuts in graphs is a well-studied phenomenon in complexity theory and hardness of approximation. For instance, consider the results of Kahn-Kalai-Linial and Friedgut which state that balanced cuts in the hypercube that don't cut too many edges must be close to halfspaces. Starting with work of Cheeger and Kleiner on the Heisenberg group, it became clear that _local_ rigidity of cuts plays a fundamental role in understanding low-distortion embeddings into L_1. In joint work with Naor, we showed that this has implications for the approximability of the general Sparsest Cut problem in graphs.
I will talk about a discrete version of this phenomenon, and how it can be applied to understand a number of questions in metric embeddings and semi-definite programming. In particular, in joint work with Moharrami, we show a large gap between the "weak" and "strong" SDP triangle inequalities, resolving a question that has been open since 2003.
Using a more sophisticated version of this technique, I will analyze a doubling metric space whose n-point subspaces require distortion sqrt{log n / log log n} to embed into L_1, nearly matching the upper bound of Gupta-Krauthgamer-Lee (2003). This work, joint with Sidiropoulos, yields a nearly-tight "weak" integrality gap for the Sparsest Cut SDP. I will end by discussing how this weak gap might be amplified to fully resolve the negative-type embedding conjecture of Goemans and Linial.
The talk will be elementary in nature; in particular, most of the technical material will involve the structure of sets in the Euclidean plane, and discrete/randomized/approximate versions of classical results from integral geometry.
Grouping a set of items into clusters according to their similarities is called clustering. Clustering is now a common technique in widely different applications in, e.g., bioinformatics, data mining, machine learning, and social network analysis. Considerable effort has been put into the study of clustering techniques in recent years. We introduce a new clustering paradigm, in which items to be classified are vertices of a graph. Vertices have their own budgets, and the goal is to cluster them such that the cost of (connections in) each cluster can be paid by the budget of its participants. Furthermore, we want vertices in different clusters to be in some sense far from each other. We propose and analyze algorithms for two similar problems in this paradigm. The resulting guarantees lead to the resolution of several network design questions. We improve the approximation ratio for prize-collecting Steiner tree and prize-collecting TSP. In addition, we present polynomial-time approximation schemes (PTAS's) for planar Steiner forest, planar multiway cut, and planar prize-collecting Steiner tree. The talk is based on joint works with Aaron Archer, MohammadTaghi Hajiaghayi, Howard Karloff, Philip Klein, Daniel Marx, and Claire Mathieu.
In this talk, I will present a general framework for constructing cut sparsifiers in undirected graphs, and use it to simplify, unify and improve upon previous sparsification results. As simple instantiations of this framework, we show that sparsifiers can be constructed by sampling edges according to their strength (a result of Bencz\'ur and Karger), effective resistance (a result of Spielman and Srivastava), edge connectivity (this resolves the main open question posed by Bencz\'ur and Karger) and Nagamochi-Ibaraki indices (this yields an elementary sparsification algorithm). In addition, we develop techniques that give the first (strictly) linear-time sparsification algorithm for unweighted graphs. Our algorithm produces sparsifiers containing O(n log n) edges in expectation; the only known construction of sparsifiers with fewer edges is by a substantially slower algorithm running in O(n^3 m) time. A key component of our results that might be of independent interest is a natural generalization of a seminal cut counting theorem due to Karger.
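As an illustration of the common sampling template behind these instantiations (a generic sketch under assumed names, not the paper's algorithm): each edge is kept independently with probability proportional to an importance score (strength, effective resistance, connectivity, or a Nagamochi-Ibaraki index), and surviving edges are reweighted so that every cut is preserved in expectation.

    import random

    def sample_sparsifier(edges, importance, c, seed=0):
        # edges: list of (u, v, w); importance: dict mapping (u, v) to a score;
        # c: oversampling factor controlling the size/accuracy trade-off.
        rng = random.Random(seed)
        sparse = []
        for (u, v, w) in edges:
            p = min(1.0, c * importance[(u, v)])
            if rng.random() < p:
                sparse.append((u, v, w / p))   # reweight so each cut is unbiased
        return sparse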
This work was done in collaboration with Ramesh Hariharan. Some of these results were also obtained by Fung and Harvey in independent concurrent work.
Computing driving directions has motivated many shortest path heuristics that answer queries on continental-scale networks, with tens of millions of intersections, in real time. The underlying algorithms are intuitive, but until recently there was no theoretical explanation of their performance. We review some of these algorithms and give the first theoretical analysis for them on a non-trivial class of networks.
For our analysis, we introduce the notion of highway dimension. The analysis works for networks with low highway dimension and gives a unified explanation of the algorithms' good performance. Our analysis explores an unexpected relationship to VC-dimension and applies boosting to graph algorithm design.
We also show that a labeling algorithm, previously studied by the distributed computation community, achieves a better theoretical time bound. This motivates a heuristic implementation of the labeling algorithm. Our experiments show that the implementation outperforms the fastest previous codes, validating the theoretical prediction.
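To give a sense of what a query in a labeling (hub-labeling) scheme looks like, here is a minimal sketch under assumed data structures (not the paper's implementation): each vertex v stores a small map label[v] from hub vertices to distances, chosen so that every pair s, t shares a hub on a shortest s-t path, and a query is a single pass over the two labels.

    def query(label, s, t):
        # distance query: minimize d(s, h) + d(h, t) over hubs h common to both labels
        ls, lt = label[s], label[t]
        return min((d + lt[h] for h, d in ls.items() if h in lt), default=float("inf"))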
Joint work with Ittai Abraham, Amos Fiat, and Renato Werneck.