Location: Gates 4B lobby.
Mailing list: thseminar@cs.stanford.edu. Email Kshipra to sign up.
Contact: Kshipra Bhawalkar (kshipra@stanford...)
A Locally Correctable Code (LCC) is an error-correcting code equipped with a probabilistic self-correction algorithm that, with high probability, can recover any coordinate of the codeword by looking at only a few other coordinates, even if a δ fraction of the coordinates are corrupted. LCCs are a stronger form of LDCs (Locally Decodable Codes), which have received a lot of attention recently due to their many applications and surprising constructions.
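For intuition only (an illustration of the definition, not part of the talk's results), the classic 2-query example is the Hadamard code over F_2: to recover any coordinate, query two random positions whose indices XOR to the desired one. The helper names below (hadamard_encode, correct_coordinate) are made up for this sketch.

```python
import random

def hadamard_encode(x):
    """Hadamard code over F_2: position a of the codeword holds <x, a> mod 2."""
    d = len(x)
    return [sum(xi & ((a >> i) & 1) for i, xi in enumerate(x)) % 2
            for a in range(2 ** d)]

def correct_coordinate(word, a, d):
    """One 2-query correction attempt for coordinate a: query random r and
    a XOR r; by linearity their sum equals <x, a> whenever both are uncorrupted."""
    r = random.randrange(2 ** d)
    return (word[r] + word[a ^ r]) % 2

# Tiny demo: corrupt one coordinate (a delta = 1/8 fraction), then locally correct.
x = [1, 0, 1]                    # message of length d = 3
c = hadamard_encode(x)           # codeword of length 2^3 = 8
c[5] ^= 1                        # corrupt coordinate 5
votes = [correct_coordinate(c, 5, 3) for _ in range(99)]
print(max(set(votes), key=votes.count))  # majority vote recovers the original coordinate w.h.p.
```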
In this work, we show a separation between 2-query LDCs and LCCs over finite fields of prime order. Specifically, we prove a lower bound of the form p^Ω(δd) on the length of linear 2-query LCCs over F_p that encode messages of length d. Our bound improves on the known bound of 2^Ω(δd), which is tight for LDCs. Our proof makes use of tools from additive combinatorics that have played an important role in several recent results in Theoretical Computer Science.
We also obtain, as corollaries of our main theorem, new results in incidence geometry over finite fields. The first is an improvement to the Sylvester-Gallai theorem over finite fields and the second is a new analog of Beck's theorem over finite fields.
Joint work with Zeev Dvir, Shubhangi Saraf, and Amir Shpilka
I will present a paper of Russell Impagliazzo and Valentine Kabanets from RANDOM 2010, which gives a new, simple proof of the Chernoff bound. In my opinion, this proof is considerably more intuitive than the classic proofs. In addition, this new proof has an application to questions on direct product theorems, which will be discussed if time allows.
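For reference (my phrasing, not the paper's), the statement in question is the standard multiplicative Chernoff bound: for independent random variables $X_1, \dots, X_n$ taking values in $[0,1]$, with $X = \sum_i X_i$ and $\mu = E[X]$, one has $\Pr[X \ge (1+\varepsilon)\mu] \le e^{-\varepsilon^2 \mu / 3}$ and $\Pr[X \le (1-\varepsilon)\mu] \le e^{-\varepsilon^2 \mu / 2}$ for every $0 < \varepsilon \le 1$.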
One of the central questions in economics is how to share resources efficiently among people. This question is especially relevant today because of its numerous applications in engineering: power engineers, for example, need to distribute electricity demand among competing power generators in a way that minimizes total cost; network engineers often need to share limited bandwidth among users. Resource allocation is very much a computer science problem, because its solution involves designing algorithms that compute efficient allocations. In this talk, I will survey several important theoretical results in this area and describe some interesting extensions I have worked on.
Consider fully dynamic data, where we track data as it is inserted and deleted. There are well-developed notions of private data analysis for dynamic data, for example using differential privacy. We want to go beyond privacy and consider privacy together with security, formulated recently as pan-privacy by Dwork et al. (ICS 2010). Informally, pan-privacy preserves differential privacy while computing desired statistics on the data, even if the internal memory of the algorithm is compromised (say, by a malicious break-in, insider curiosity, or government or legal fiat).
We study pan-private algorithms for basic analyses, like estimating distinct count, moments, and heavy-hitter count, with fully dynamic data. We present the first known pan-private algorithms for these problems in the fully dynamic model. Our algorithms rely on sketching techniques popular in streaming: in some cases, we add suitable noise to a previously known sketch, using a novel approach of calibrating noise to the underlying problem structure and the projection matrix of the sketch; in other cases, we maintain certain statistics on sketches; in yet others, we define novel sketches. We also present the first known lower bounds explicitly for pan-privacy, showing our results to be nearly optimal for these problems. Our lower bounds are stronger than those implied by differential privacy or dynamic data streaming alone and hold even if unbounded memory and/or unbounded processing time are allowed. The lower bounds use a noisy decoding argument and exploit a connection between pan-private algorithms and data sanitization.
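Purely as a cartoon of the "noise on the sketch" idea (this is not the authors' algorithm; in particular, the noise here is naive rather than calibrated to the problem structure and projection matrix as in the paper), one can picture a linear sketch maintained under insertions and deletions whose stored state is perturbed with Laplace noise, so that the memory itself is differentially private with respect to a single update:

```python
import numpy as np

rng = np.random.default_rng(0)

class NoisyLinearSketch:
    """Toy sketch: +/-1 random projection (AMS-style) of a frequency vector,
    updated under insertions/deletions, with Laplace noise added to the state.
    Illustration only, not the construction from the paper."""
    def __init__(self, universe_size, sketch_dim, eps):
        self.P = rng.choice([-1.0, 1.0], size=(sketch_dim, universe_size))
        # One update changes each of the sketch_dim entries by at most 1,
        # so Laplace noise of scale sketch_dim / eps masks any single update.
        self.state = rng.laplace(scale=sketch_dim / eps, size=sketch_dim)

    def update(self, item, delta):
        # delta = +1 for an insertion, -1 for a deletion (fully dynamic data).
        self.state += delta * self.P[:, item]

    def second_moment_estimate(self):
        # Mean of squared entries estimates sum_i f_i^2, up to the added noise.
        return float(np.mean(self.state ** 2))

sk = NoisyLinearSketch(universe_size=1000, sketch_dim=200, eps=1.0)
for item in [3, 3, 7, 7, 7, 42]:
    sk.update(item, +1)
sk.update(42, -1)
print(sk.second_moment_estimate())
```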
Joint work with D. Mir, S. Muthukrishnan, and Rebecca Wright
I will be talking about "Learning Hurdles for Sleeping Experts", my paper with Varun Kanade in ITCS 2012. We study the online decision problem where the set of available actions varies over time, also called the sleeping experts problem. We consider the setting where the performance comparison is made with respect to the best ordering of actions in hindsight. In this paper, both the payoff function and the availability of actions are adversarial. Kleinberg et al. (2008) gave a computationally efficient no-regret algorithm in the setting where payoffs are stochastic. Kanade et al. (2009) gave an efficient no-regret algorithm in the setting where action availability is stochastic. However, the question of whether there exists a computationally efficient no-regret algorithm in the adversarial setting was posed as an open problem by Kleinberg et al. (2008). We show that such an algorithm would imply an algorithm for PAC learning DNF, a long-standing and important open problem. We also show that a related problem, the gambling problem posed as an open problem by Abernethy (2010), is related to agnostically learning halfspaces, albeit under restricted distributions.
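To make the benchmark concrete, here is a toy computation (my own example, not from the paper) of the best ordering of actions in hindsight: an ordering plays its highest-ranked available action in each round, and regret is measured against the best such ordering.

```python
from itertools import permutations

def payoff_of_ordering(order, rounds):
    """Play the highest-ranked available action each round and sum its payoff."""
    total = 0.0
    for available, payoffs in rounds:
        chosen = next(a for a in order if a in available)
        total += payoffs[chosen]
    return total

# Each round: (set of available actions, payoff of every action in that round).
rounds = [
    ({0, 1},    {0: 0.3, 1: 0.9, 2: 0.0}),
    ({1, 2},    {0: 0.0, 1: 0.1, 2: 0.8}),
    ({0, 1, 2}, {0: 0.5, 1: 0.2, 2: 0.7}),
]
actions = range(3)

# The offline benchmark: best ordering in hindsight (brute force; exponential in general).
best = max(permutations(actions), key=lambda o: payoff_of_ordering(o, rounds))
print(best, payoff_of_ordering(best, rounds))   # (2, 1, 0) with total payoff 2.4
```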
We study a market for private data in which a data analyst wishes to publicly release a statistic computed over a database of private information. The statistic we focus on is the inner product of the database entries with a publicly known weight vector. Individuals who own the data incur a cost for their loss of privacy, quantified in terms of the differential-privacy guarantee given by the analyst at the time of the release. To properly incentivize individuals, the analyst must compensate them for the cost they incur. This gives rise to a "privacy auction", in which the analyst decides how much privacy to purchase from each individual in order to cheaply obtain an accurate estimate of the inner product. The individuals are profit-maximizing, so we would like to design a truthful auction.
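For context, the standard Laplace-mechanism baseline for this statistic (this is not the class of estimates or the auction designed in the paper) adds noise scaled to the largest weight, assuming the data entries lie in [0, 1]:

```python
import numpy as np

def private_inner_product(x, w, eps, rng=np.random.default_rng()):
    """eps-differentially private estimate of <w, x> for x_i in [0, 1].
    Changing one entry x_i moves the sum by at most |w_i| <= max|w|, so
    Laplace noise of scale max|w| / eps suffices (standard Laplace mechanism)."""
    sensitivity = float(np.max(np.abs(w)))
    return float(np.dot(w, x) + rng.laplace(scale=sensitivity / eps))

x = np.array([0.2, 1.0, 0.0, 0.7])   # private entries held by individuals
w = np.array([1.0, 3.0, 0.5, 2.0])   # publicly known weight vector
print(private_inner_product(x, w, eps=0.5))
```

Note that this baseline gives every individual the same guarantee eps regardless of their weight; the estimates studied in the paper instead tailor each individual's guarantee to that individual's effect on the accuracy of the estimate.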
First, we formalize the trade-off between privacy and accuracy in this setting; we show that obtaining an accurate estimate of the inner product necessitates providing poor privacy guarantees to individuals that have a significant effect on the estimate. We show that a simple, natural class of estimates achieves an order-optimal trade-off between privacy and accuracy, and, consequently, it suffices to focus on auction mechanisms that output such estimates. These estimates guarantee privacy to individuals in proportion to their effect on the accuracy of the estimate. We use this observation to design a truthful, individually rational, proportional-purchase mechanism under the constraint that the analyst has a fixed budget. We show that our mechanism is 5-approximate in terms of accuracy compared to the optimal mechanism, and that no truthful mechanism can achieve a (2 - ε)-approximation for any ε > 0.
Joint work with Nadia Fawaz and Stratis Ioannidis.
A preliminary version of the paper describing our results can be found at: http://arxiv.org/abs/1111.2885
A basic fact in algebraic graph theory is that the number of connected components in an undirected graph is equal to the multiplicity of the eigenvalue zero in the Laplacian matrix of the graph. In particular, the graph is disconnected if and only if there are at least two eigenvalues equal to zero. Cheeger's inequality and its variants provide an approximate version of the latter fact; they state that a graph has a sparse cut if and only if there are at least two eigenvalues that are close to zero.
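A quick numerical check of that basic fact on a toy graph with two components (my example, not from the talk):

```python
import numpy as np

# Adjacency matrix of a 5-vertex graph: a triangle {0,1,2} and an edge {3,4}.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 0, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A             # combinatorial Laplacian L = D - A
eigenvalues = np.linalg.eigvalsh(L)
print(np.sum(np.isclose(eigenvalues, 0)))  # prints 2 = number of connected components
```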
In this talk I show that an analogous characterization holds for higher multiplicities, i.e., there are k eigenvalues close to zero if and only if the vertex set can be partitioned into k subsets, each defining a sparse cut. Our result also provides a theoretical justification for clustering algorithms that use the bottom k eigenvectors to embed the vertices into R^k, and then apply geometric considerations to the embedding. Our techniques also yield a nearly optimal tradeoff between the expansion of sets of size n/k and the k-th smallest eigenvalue of the normalized Laplacian matrix.
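As an illustration of the kind of clustering algorithm this justifies (a sketch on a toy graph of my own, not the procedure analyzed in the talk): embed the vertices into R^k using the bottom k eigenvectors of the normalized Laplacian, then cluster the embedded points geometrically, e.g. with k-means.

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clusters(A, k):
    """Embed vertices into R^k via the bottom-k eigenvectors of the
    normalized Laplacian I - D^{-1/2} A D^{-1/2}, then run k-means."""
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
    L_norm = np.eye(len(A)) - d_inv_sqrt @ A @ d_inv_sqrt
    eigvals, eigvecs = np.linalg.eigh(L_norm)   # eigenvalues in ascending order
    embedding = eigvecs[:, :k]                  # bottom k eigenvectors
    return KMeans(n_clusters=k, n_init=10).fit_predict(embedding)

# Two triangles joined by a single (sparse-cut) edge between vertices 2 and 3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
print(spectral_clusters(A, k=2))   # two clusters, one per triangle
```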
Based on joint work with James R. Lee and Luca Trevisan.
The talk will cover recent work (still in progress) on lattice complexity problems that are useful for threshold signatures and more.