Location: We meet in the Gates 4B lobby, also known as the theory lounge.
Mailing List: thseminar@cs.stanford.edu. Email shaddin to sign up.
Contact: Shaddin Dughmi (shaddin at cs dot stanford dot edu)
Food: Wendy usually chooses the food at her discretion. However, the speaker may suggest a preference to Wendy ahead of time for the day of their talk.
For previous quarter schedules see here.
Motivated by sponsored search auctions with hard budget constraints given by the advertisers, we study multi-unit auctions of a single item. An important example is a sponsored result slot for a keyword, with the many units representing its inventory in, say, a month. In this single-item multi-unit auction, each bidder has a private value for each unit and a private budget, which is the total amount of money she can spend in the auction. A recent impossibility result [Dobzinski et al., FOCS'08] precludes the existence of a truthful mechanism with Pareto-optimal allocations in this important setting. We propose Sort-Cut, a mechanism that does the next best thing from the auctioneer's point of view, a property we term {\em semi-truthful}. In our mechanism, it is a weakly dominant strategy for all agents to state their true budgets and to not understate their values. Thus the only way a bidder can possibly benefit from lying in a semi-truthful mechanism is by overstating her value, which leads to good revenue properties for the auctioneer at equilibrium. While we are unable to give a complete characterization of the equilibria of our mechanism, we prove that {\em some} equilibrium of the proposed mechanism optimizes the revenue over all Pareto-optimal mechanisms, and that this equilibrium is the unique one resulting from a natural rational bidding strategy (where every losing bidder bids at least her true value). The latter is similar in spirit to the approach of [Edelman et al., American Economic Review 2007] in their analysis of the equilibria of the Generalized Second Price (GSP) auction implemented for sponsored search ads at Google. Perhaps even more significantly, we show that the revenue of {\bf every} equilibrium of our mechanism differs from the optimal revenue by at most the budget of one bidder (under some mild assumptions). While earlier work on the problem led to mechanisms that leave some items unallocated [Borgs et al., EC'05] and that were variants of earlier ideas [Goldberg et al., SODA'01], Sort-Cut provides a new pricing idea and generalizes second-price auctions in a natural way.
Contributors: I. Hafalir, R. Ravi, A. Sayedi.
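To give a feel for this style of mechanism, here is a minimal Python sketch of a greedy, budget-aware multi-unit auction. It is my own simplification, not the paper's Sort-Cut: the greedy order, the budget-capped demand, and the second-price-style cut price are all illustrative assumptions.

def greedy_budgeted_auction(bidders, supply):
    # bidders: list of (name, per_unit_value, budget) as declared; supply: identical units.
    # Bidders are sorted by declared per-unit value; units are handed out greedily,
    # each bidder capped by what her budget affords at her own bid, and the first
    # fully excluded bid sets a uniform second-price-style "cut" price.
    # Assumes all declared values are positive.
    order = sorted(bidders, key=lambda b: b[1], reverse=True)
    allocation, remaining, cut_price = {}, supply, 0.0
    for name, value, budget in order:
        if remaining == 0:
            cut_price = value  # highest losing bid becomes the uniform price
            break
        take = min(remaining, int(budget // value))
        allocation[name] = take
        remaining -= take
    budgets = {name: budget for name, _, budget in bidders}
    # every winner pays the cut price per unit, capped by her declared budget
    payments = {name: q * cut_price if q * cut_price <= budgets[name] else budgets[name]
                for name, q in allocation.items()}
    return allocation, cut_price, payments

For instance, with bidders (a: value 5, budget 20), (b: value 4, budget 12), (c: value 3, budget 9) and 6 units, a gets 4 units and b gets 2, both paying c's losing bid of 3 per unit.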
The question of how to publish an anonymized search log was brought to the forefront by a well-intentioned but privacy-unaware AOL search log release. Since then, a series of ad hoc techniques has been proposed in the literature, though none are known to be provably private. We take a major step towards a solution: we show how queries, clicks, and their associated perturbed counts can be published in a manner that rigorously preserves privacy. Our algorithm is decidedly simple to state, but non-trivial to analyze. On the opposite side of privacy is the question of whether the data we can safely publish is of any use. Our findings offer a glimmer of hope: through a collection of experiments on a real search log, we demonstrate that a non-negligible fraction of queries and clicks can indeed be safely published. In addition, we select an application, keyword generation, and show that the keyword suggestions generated from the perturbed data resemble those generated from the original data.
Joint work with Krishnaram Kenthapadi, Nina Mishra, and Alexandros Ntoulas.
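The standard way to make "perturbed counts" rigorous is to add calibrated noise and suppress rare items. The sketch below shows that generic recipe (a Laplace mechanism with a publication threshold); the parameter names and the sensitivity-1 assumption are mine rather than details taken from the talk.

import random

def publish_counts(counts, epsilon, threshold):
    # counts: dict mapping a query (or click) to its true count.
    # epsilon: privacy parameter; smaller means noisier output.
    # threshold: suppress items whose noisy count falls below this.
    scale = 1.0 / epsilon  # Laplace scale for sensitivity-1 counts (assumption)
    published = {}
    for item, c in counts.items():
        # difference of two exponentials is a Laplace(0, scale) draw
        noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
        noisy = c + noise
        if noisy >= threshold:
            published[item] = round(noisy)
    return published

The threshold is what protects rare (and hence potentially identifying) queries: an item that only a few users ever issued is unlikely to clear it, noise or no noise.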
Combinatorial (weighted) counting problems, such as counting the number of satisfying assignments of a Boolean satisfiability problem or computing the partition function of a graphical model, can be expressed as tensor contractions diagrammed by a bipartite graph $\Gamma=(V,U,E)$. Many algorithms for solving these problems efficiently use tree structure and factorization over a semiring (the "sum-product algorithm"). Over a proper ring, cancellation can be exploited as well. This requires a change of basis so that the Grassmann-Plücker identities are locally satisfied, and the addition of orientation or ordering information. Recently, this Pfaffian-based approach has been used by Valiant and Cai to provide unexpected, polynomial-time "accidental" algorithms for certain counting problems. Their method expands the vertices of $\Gamma$ into graph fragments called matchgates and applies the FKT algorithm to find a Pfaffian orientation and compute the perfect matching polynomial. We demonstrate a simplification of this approach using only an $|E| \times |E|$ matrix and give some generalizations. Natural problems treatable by these generalizations have been previously considered in a different context, and we present one such example.
This is joint work with J.M. Landsberg and Serguei Norine.
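For reference, the identity that powers this cancellation: for a skew-symmetric $2n \times 2n$ matrix $A$, the Pfaffian satisfies $\operatorname{Pf}(A)^2 = \det(A)$, and for a Pfaffian orientation of a planar graph $G$ with signed adjacency matrix $A_G$, the FKT algorithm computes the number of perfect matchings as
\[
\#\mathrm{PM}(G) \;=\; |\operatorname{Pf}(A_G)| \;=\; \sqrt{\det(A_G)},
\]
so a sum that looks #P-hard collapses to a determinant computable in polynomial time.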
The data stream model represents a setting where approximating pairwise, or $k$-wise, independence with sublinear memory is of considerable importance. In the streaming model, the joint distribution is given by a stream of $k$-tuples, with the goal of testing correlations among the components measured over the entire stream. Indyk and McGregor (SODA'08) recently gave exciting new results for measuring pairwise independence in this model. Their methods provide a $\log n$-approximation of the statistical distance between the joint and product distributions. Indyk and McGregor leave, as their main open question, the problem of improving this $\log n$-approximation for the statistical distance metric. This talk covers our recent paper "Measuring Independence of Datasets". We solve the main open problem posed by Indyk and McGregor for the statistical distance for pairwise independence, and extend this result to any constant $k$. In particular, we present an algorithm that computes an $(\epsilon, \delta)$-approximation of the statistical distance between the joint and product distributions defined by a stream of $k$-tuples. Our algorithm requires $O\left(\left(\frac{1}{\epsilon}\log\frac{nm}{\delta}\right)^{(30+k)^k}\right)$ memory and a single pass over the data stream.
Joint work with Rafail Ostrovsky (UCLA).
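As a point of reference for what is being approximated, here is an exact (and memory-hungry) computation of the statistical distance for pairs, i.e., $k=2$. The streaming algorithm's whole point is to estimate this quantity without storing the counters below; the normalization by $1/2$ is the total-variation convention, which may differ from the paper's.

from collections import Counter
from itertools import product

def statistical_distance_from_stream(pairs):
    # pairs: the full stream of 2-tuples, held in memory (the baseline
    # the sublinear-memory algorithm avoids).
    n = len(pairs)
    joint = Counter(pairs)                  # empirical joint distribution
    left = Counter(a for a, _ in pairs)     # marginal of first component
    right = Counter(b for _, b in pairs)    # marginal of second component
    dist = 0.0
    for a, b in product(left, right):
        p_joint = joint.get((a, b), 0) / n
        p_prod = (left[a] / n) * (right[b] / n)
        dist += abs(p_joint - p_prod)
    return dist / 2  # total variation distance is half the L1 distance

Exact computation needs memory proportional to the product of the supports, which is exactly what the sketching approach avoids.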
We consider a robust model, proposed by Scarf in 1958, for stochastic optimization when only the marginal probabilities are given and the correlation between the random variables is unknown. In the robust model, the objective is to minimize the expected cost against the worst possible joint distribution with those marginals. We provide efficient approximation algorithms for this model when the objective function (a) is submodular, (b) admits a certain cost sharing, or (c) is supermodular in the random set. Further, we explore the effects of ignoring correlations, and precisely establish the approximation error introduced by using the independent distribution instead of the worst-case joint distribution. We show that for some classes of functions the independent distribution indeed forms a good substitute, while for others it can give an exponentially higher cost. Our analysis is based on exploiting new connections with problems in social welfare maximization and the existence of Walrasian equilibria, which may be of independent interest.
Contributors: S. Agrawal, Y. Ding, A. Saberi, Y. Ye
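In symbols, and in notation assumed here rather than taken from the talk, the robust objective with given marginals $p_i = \Pr[i \in S]$ for a random set $S$ is
\[
\min_{x \in \mathcal{X}} \; \max_{D \in \Delta(p)} \; \mathbb{E}_{S \sim D}\big[\, c(x, S) \,\big],
\]
where $\Delta(p)$ is the set of all joint distributions over random sets consistent with the marginals $p$. Ignoring correlations amounts to replacing the inner maximum with the expectation under the product distribution $\prod_i p_i$; the talk quantifies exactly how much this can cost.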
I will discuss the paper "How to use a short basis: Trapdoors for hard lattices and new cryptographic constructions" by Craig Gentry, Chris Peikert and Vinod Vaikuntanathan. This paper appeared in STOC 2008. The authors show how to construct a variety of “trapdoor” cryptographic tools assuming the worst-case hardness of standard lattice problems (such as approximating the length of the shortest nonzero vector to within certain polynomial factors). They introduce a new notion of preimage sampleable functions and build simple and efficient “hash-and-sign” digital signature schemes and an identity-based encryption scheme. A core technical component of their constructions is an efficient algorithm that, given a basis of an arbitrary lattice, samples lattice points from a discrete Gaussian probability distribution whose standard deviation is essentially the length of the longest Gram-Schmidt vector of the basis.
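As a toy illustration of that core component, here is a one-dimensional rejection sampler for the discrete Gaussian over the integers. The GPV sampler applies this kind of step coordinate-wise along the Gram-Schmidt directions of the basis; the tail cut-off below is my implementation shortcut, not a detail from the paper.

import math
import random

def sample_discrete_gaussian_1d(center, s, tail=12):
    # Outputs an integer z with probability proportional to
    # exp(-pi * (z - center)^2 / s^2), by rejection sampling from a
    # uniform proposal on a window of +/- tail standard deviations
    # (the tail cut-off is an assumption, fine for illustration).
    lo = math.floor(center - tail * s)
    hi = math.ceil(center + tail * s)
    while True:
        z = random.randint(lo, hi)
        rho = math.exp(-math.pi * (z - center) ** 2 / s ** 2)
        if random.random() < rho:
            return z

Roughly, keeping the deviation at the length of the longest Gram-Schmidt vector makes the output distribution depend only on the lattice and not on the secret basis used, which is what keeps signatures from leaking the trapdoor.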
Monotone span programs express certain monotone boolean functions using linear algebra. I will examine what can be computed with monotone span programs, showing that they are strictly more powerful than both monotone circuits and monotone threshold circuits. I will also show applications to cryptography, specifically identity-based encryption and attribute-based encryption.
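As a concrete toy example (mine, not from the talk): a monotone span program over a field $\mathbb{F}$ consists of a matrix $M$ with rows labeled by variables and a target vector $t$; it accepts an input iff $t$ lies in the span of the rows whose labels are set to 1. The program
\[
M = \begin{pmatrix} 1 & 1 \\ 0 & 1 \\ 0 & 1 \end{pmatrix}
\quad\text{with rows labeled } x_1, x_2, x_3 \text{ and target } t = (1,0)
\]
computes $f(x_1,x_2,x_3) = x_1 \wedge (x_2 \vee x_3)$: the rows of $x_1$ and $x_2$ combine as $(1,1)-(0,1)=(1,0)$, while neither $x_1$ alone nor any subset excluding $x_1$ spans $t$. The number of rows is the program's size, and in linear secret sharing (the bridge to the cryptographic applications) the rows correspond to shares and $t$ to the reconstruction vector for the secret.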
An attempt at expressing as equations the four constructive postulates of Euclid's Elements, namely 1, 2, 3, and 5, turns out unexpectedly to axiomatize exactly the vector spaces over the complex rationals, with their homomorphisms being the linear transformations. The Loewenheim-Skolem theorem ought to make this impossible.