This workshop aims to cover recent developments in the area of “algorithms with
predictions” (also known as learning-augmented algorithms or data-driven algorithms). These methods show how to
parameterize algorithms so that they can adapt their behavior to the properties of the input
distribution and consequently improve their performance, such as running time, space usage, or quality of the
solution.
Generally speaking, a result in this area takes a problem with strong computational lower bounds (for
instance on the competitive ratio), identifies a compact prediction that can be learned from real data,
and gives a proof tying the performance of the algorithm to the quality of the underlying prediction.
The field has blossomed, with applications to classical streaming algorithms, online scheduling,
clustering, filtering data structures, and many other problems. All of these methods deliver improved
performance when the predictions are good, while maintaining nearly identical worst-case guarantees when
they are not.
The workshop will cover recent advances in different domains, and introduce newcomers to open problems
in this area.
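To make this template concrete, here is a toy example in the spirit of the ski-rental algorithm of Purohit, Svitkina, and Kumar (NeurIPS 2018); the code and parameter names are illustrative, not drawn from any talk below. Renting skis costs 1 per day, buying costs b, a prediction yhat of the true number of ski days y is available, and a trust parameter lam trades consistency against robustness.

```python
import math

def ski_rental_with_prediction(y, yhat, b, lam):
    """Buy on day ceil(lam*b) if the prediction says buying pays off
    (yhat >= b); otherwise hedge and wait until day ceil(b/lam).
    This is (1+lam)-consistent (near-optimal when the prediction is
    right) and (1+1/lam)-robust (bounded harm when it is wrong)."""
    buy_day = math.ceil(lam * b) if yhat >= b else math.ceil(b / lam)
    if y >= buy_day:
        return (buy_day - 1) + b  # rent for buy_day-1 days, then buy
    return y                      # season ends first: rent every day

# Offline optimum is min(y, b) = 30 in both cases below.
print(ski_rental_with_prediction(y=100, yhat=95, b=30, lam=0.5))  # 44 <= 1.5 * 30
print(ski_rental_with_prediction(y=100, yhat=5,  b=30, lam=0.5))  # 89 <= 3 * 30
```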
When:
June 21, 23, and 24, 2022 (9:00am -- 12:00pm CET, Rome time)
Tuesday, June 21st
Yossi Azar
Flow Time Scheduling with Uncertain Processing Time
We consider the problem of online scheduling on a single machine to minimize unweighted and weighted flow time. Existing algorithms for these problems require exact knowledge of the processing time of each job. This assumption is crucial: even a slight perturbation of the processing times would lead to a polynomial competitive ratio. However, it very rarely holds in real-life scenarios. We present a competitive algorithm for unweighted flow time (with competitive ratio a function of the distortion) that does not require knowledge of the distortion in advance. For weighted flow time we present competitive algorithms as well, but in this case we need to know (an upper bound on) the distortion in advance. This is joint work with Stefano Leonardi and Noam Touitou, based on papers appearing in STOC 2021 and SODA 2022.
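To see the sensitivity the abstract alludes to, here is a toy preemptive simulator (our illustration, not the algorithm from the talk): it schedules by shortest predicted remaining processing time, where each predicted size may be off by a multiplicative distortion mu, and reports the total flow time.

```python
import random

def total_flow_time(jobs):
    """Preemptive single-machine schedule by shortest predicted remaining
    time. jobs: list of (release_time, true_size, predicted_size)."""
    jobs = sorted(jobs)
    t, i, flow = 0.0, 0, 0.0
    active = []  # entries: [predicted_remaining, release_time, true_remaining]
    while i < len(jobs) or active:
        if not active:
            t = max(t, jobs[i][0])          # machine idles until next arrival
        while i < len(jobs) and jobs[i][0] <= t:
            r, p, ph = jobs[i]
            active.append([ph, r, p])
            i += 1
        job = min(active)                   # least predicted remaining time
        run = job[2]                        # run until completion...
        if i < len(jobs):
            run = min(run, jobs[i][0] - t)  # ...or until the next arrival
        t += run
        job[0] -= run
        job[2] -= run
        if job[2] <= 1e-9:
            flow += t - job[1]              # flow time = completion - release
            active.remove(job)
    return flow

rng = random.Random(0)
mu = 2.0  # each predicted size lies in [size/mu, size*mu]
jobs = []
for _ in range(50):
    p = rng.uniform(1, 10)
    jobs.append((rng.uniform(0, 100), p, p * rng.uniform(1 / mu, mu)))
print(total_flow_time(jobs))                             # distorted predictions
print(total_flow_time([(r, p, p) for r, p, _ in jobs]))  # exact sizes (SRPT)
```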
10:45 -- 11:15
Benjamin Moseley
Machine Learning for Faster Optimization
This talk will discuss a model for augmenting algorithms with useful predictions to improve their running time. The model ensures that predictions are formally learnable and robust: learnability guarantees that predictions can be efficiently constructed from past data, and robustness guarantees that a prediction remains useful under modest changes to the problem input. This talk will discuss predictions that satisfy these properties and yield improved running times for matching algorithms.
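One concrete mechanism of this kind (a sketch under our own simplifications, not necessarily the construction from the talk): seed an augmenting-path matching algorithm with a predicted matching learned from past instances, so the residual work scales with how much of the prediction was wrong.

```python
def max_matching(graph, predicted=()):
    """Bipartite maximum matching via augmenting paths, warm-started from a
    predicted matching. graph: dict left_vertex -> list of right vertices;
    predicted: iterable of (left, right) pairs."""
    match_l, match_r = {}, {}
    for u, v in predicted:              # keep only valid, non-conflicting pairs
        if v in graph.get(u, ()) and u not in match_l and v not in match_r:
            match_l[u], match_r[v] = v, u

    def augment(u, seen):               # standard alternating-path search
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                if v not in match_r or augment(match_r[v], seen):
                    match_l[u], match_r[v] = v, u
                    return True
        return False

    for u in graph:                     # repair only what the prediction missed
        if u not in match_l:
            augment(u, set())
    return match_l

g = {1: ['a', 'b'], 2: ['a'], 3: ['b', 'c']}
print(max_matching(g, predicted=[(1, 'b'), (2, 'a')]))  # {1:'b', 2:'a', 3:'c'}
```

With a perfect prediction the repair loop does no augmenting work at all; the talk's concern is making such predictions formally learnable and robust.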
11:20 -- 11:50
Ravi Kumar
Parsimonious Learning-Augmented Algorithms
In this talk we will consider the number of predictions used by a learning-augmented algorithm as a resource measure. Focusing on two problems---online learning in the regret setting and online caching in the competitive-ratio setting---we show that a sublinear number of predictions suffices to achieve non-trivial performance gains.
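As a toy version of prediction-as-a-resource (our sketch, not the algorithms from the talk), consider a cache that spends a fixed prediction budget on misses to evict the page with the furthest predicted next request, and falls back to LRU once the budget runs out:

```python
from collections import OrderedDict

def count_misses(requests, k, next_use, budget):
    """Serve `requests` with a size-k cache. next_use(page, i) is a
    (learned) prediction of when `page` is next requested after time i.
    While the budget lasts, evict the page with the furthest predicted
    next use (Belady's rule on predictions); afterwards fall back to LRU."""
    cache = OrderedDict()                   # pages in least-recently-used order
    misses = 0
    for i, page in enumerate(requests):
        if page in cache:
            cache.move_to_end(page)         # refresh recency
            continue
        misses += 1
        if len(cache) >= k:
            if budget >= len(cache):        # one prediction per cached page
                budget -= len(cache)
                victim = max(cache, key=lambda p: next_use(p, i))
            else:
                victim = next(iter(cache))  # LRU victim
            del cache[victim]
        cache[page] = True                  # newest entry = most recently used
    return misses
```

The question the talk asks is how small the budget can be while still beating the prediction-free baseline.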
Thursday, June 23rd
09:00 -- 09:30
Debmalya Panigrahi
Online Algorithms with Multiple Advice
The bulk of the literature on online algorithms with ML advice focuses on a single (possibly noisy) predictor. But, in reality, multiple ML models are often used to make future predictions for the same application, and their predictions/advice can differ from one another. In this case, how should an online algorithm choose among these different suggestions? This talk will focus on models, problems, and algorithms to address this question. The talk will include some recent results with Keerti Anand, Rong Ge, and Amit Kumar, and briefly touch upon older results with Sreenivas Gollapudi.
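The generic starting point for aggregating advice (a sketch of the classic experts tool, not the talk's results) is multiplicative weights: follow each predictor with probability proportional to its past accuracy, which guarantees expected loss close to that of the best single predictor in hindsight.

```python
import random

def follow_predictors(losses, eta=0.1, seed=0):
    """losses[t][j] in [0,1] is predictor j's loss at step t. Pick one
    predictor per step with probability proportional to its weight, then
    decay weights exponentially in the observed losses. With a suitably
    tuned eta, expected regret against the best fixed predictor is
    O(sqrt(T log k))."""
    rng = random.Random(seed)
    k = len(losses[0])
    weights = [1.0] * k
    total = 0.0
    for step in losses:
        j = rng.choices(range(k), weights=weights)[0]
        total += step[j]
        for a in range(k):
            weights[a] *= (1 - eta) ** step[a]
    return total
```

Following full online algorithms rather than point predictions is harder, since switching between them can itself incur cost; that gap is part of what the talk addresses.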
09:35 -- 10:05
Ola Svensson
Learning-Augmented Online Algorithms and the Primal-Dual Method
The design of learning-augmented online algorithms is a new and active research area. The goal is to understand how best to incorporate predictions of the future, provided e.g. by machine learning algorithms, that rarely come with guarantees on their accuracy. In the absence of such guarantees, the difficulty is to find a good balance: on the one hand, blindly following the prediction might lead to a very bad solution if the prediction is misleading; on the other hand, an algorithm that does not trust the prediction at all will never benefit from an excellent one. An explosion of recent results addresses this issue by designing smart algorithms that exploit the problem structure to achieve a good trade-off between these two cases. In this talk, we will discuss this emerging line of work. In particular, we will show how to unify and generalize some of these results by extending the powerful primal-dual method for online algorithms to the learning-augmented setting. This is joint work with Etienne Bamas and Andreas Maggiori.
10:45 -- 11:15
Ronitt Rubinfeld
Oracles in Distribution Testing
Distribution testing problems often require a prohibitively large number of samples. However, when useful oracles answering various questions about the distributions are available, dramatically fewer samples are needed. In this talk, I will survey such results, including scenarios in which approximate oracles are learned from the data.
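One concrete example of the phenomenon (our sketch, not necessarily one from the talk): testing uniformity over a domain of size n needs roughly sqrt(n) samples in the standard model, but with an evaluation oracle for p the sample count is independent of n, because E_{x~p}[p(x)] = ||p||_2^2 equals 1/n exactly when p is uniform and is at least (1 + 4*eps^2)/n when p is eps-far from uniform in total variation.

```python
import random

def is_uniform(sample, evaluate, n, eps, m=20000):
    """sample() draws x ~ p; evaluate(x) returns p(x) via the oracle.
    Estimates E_{x~p}[p(x)] = sum_x p(x)^2 and thresholds it halfway
    across the gap between the uniform case (1/n) and the eps-far case
    (>= (1 + 4*eps^2)/n). The sample count m does not depend on n."""
    estimate = sum(evaluate(sample()) for _ in range(m)) / m
    return estimate < (1 + 2 * eps ** 2) / n

# A skewed distribution: the first half of the domain carries 75% of the
# mass, so the TV distance from uniform is 0.25.
n = 1000
p = [1.5 / n] * (n // 2) + [0.5 / n] * (n // 2)
rng = random.Random(0)
draw = lambda: rng.choices(range(n), weights=p)[0]
print(is_uniform(draw, lambda x: p[x], n, eps=0.25))  # False: far from uniform
```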
11:20 -- 11:50
Panel
Moderated by Michael Mitzenmacher
Friday, June 24th
09:00 -- 09:30
Stefano Leonardi
Prophet Inequalities and Mechanism Design with a Single Sample
We present recent results on prophet inequalities for matching markets and on two-sided mechanism design that provide near-optimal approximations when only limited information is available, in the form of a single sample from each prior distribution.
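In the simplest single-item case this has a remarkably clean form (our illustration, following Rubinstein, Wang, and Weinberg, 2020): set the threshold to the maximum of one sample drawn from each value distribution, accept the first arriving value that meets it, and you are guaranteed half of the prophet's expected maximum.

```python
import random

rng = random.Random(1)
dists = [lambda s=s: rng.random() * s for s in (1.0, 2.0, 5.0)]  # Uniform[0, s]

def single_sample_threshold(dists):
    """One sample per distribution sets the threshold; then accept the
    first real value that meets it (0 if none does). Guarantees at least
    half the prophet's expected maximum."""
    threshold = max(d() for d in dists)
    for d in dists:
        v = d()
        if v >= threshold:
            return v
    return 0.0

trials = 100_000
alg = sum(single_sample_threshold(dists) for _ in range(trials)) / trials
prophet = sum(max(d() for d in dists) for _ in range(trials)) / trials
print(alg / prophet)  # empirically above the guaranteed 0.5
```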
09:35 -- 10:05
Kapil Vaidya
SNARF: A Learning-Enhanced Range Filter
We present the Sparse Numerical Array-Based Range Filter (SNARF), a learned range filter that efficiently supports range queries for numerical data. SNARF creates a model of the data distribution to map the keys into a bit array, which is stored in a compressed form. The model, together with the compressed bit array, constitutes SNARF and is used to answer membership queries. We evaluate SNARF on multiple synthetic and real-world datasets, both as a stand-alone filter and by integrating it into RocksDB. For range queries, SNARF provides up to a 50x better false positive rate than state-of-the-art range filters, such as SuRF and Rosetta, at the same space usage. We also evaluate SNARF in RocksDB as a filter replacement for filtering requests before they access on-disk data structures, where it improves the execution time of the system by up to 10x compared to SuRF and Rosetta on certain read-only workloads.
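A stripped-down rendition of this design (a sketch under our own modeling choices, not the actual SNARF code): fit a monotone piecewise-linear model of the key distribution, map each key through the model into a slot of a bit array, and answer a range query by probing the slots the model assigns to the interval. Monotonicity guarantees no false negatives; false positives come from model error and slot collisions.

```python
import bisect, random

class LearnedRangeFilter:
    """Toy learned range filter in the spirit of SNARF. A piecewise-linear
    model of the empirical CDF maps each key to a slot in a bit array; a
    monotone model means a key in [lo, hi] always lands between slot(lo)
    and slot(hi), so range queries never produce false negatives."""

    def __init__(self, keys, bits_per_key=8, segments=64):
        keys = sorted(keys)
        self.m = bits_per_key * len(keys)
        # Model: the CDF sampled at `segments` quantiles, linearly interpolated.
        idx = [round(j * (len(keys) - 1) / segments) for j in range(segments + 1)]
        self.xs = [keys[j] for j in idx]
        self.bits = bytearray((self.m + 7) // 8)  # stand-in for the compressed array
        for k in keys:
            self._set(self._slot(k))

    def _slot(self, key):
        # Monotone position estimate in [0, m-1] from the piecewise-linear CDF.
        j = bisect.bisect_right(self.xs, key)
        if j == 0:
            cdf = 0.0
        elif j == len(self.xs):
            cdf = 1.0
        else:
            x0, x1 = self.xs[j - 1], self.xs[j]
            frac = (key - x0) / (x1 - x0) if x1 > x0 else 0.0
            cdf = (j - 1 + frac) / (len(self.xs) - 1)
        return min(self.m - 1, int(cdf * self.m))

    def _set(self, i):
        self.bits[i // 8] |= 1 << (i % 8)

    def may_contain(self, lo, hi):
        # Probe every slot the model assigns to the query interval [lo, hi].
        return any(self.bits[i // 8] >> (i % 8) & 1
                   for i in range(self._slot(lo), self._slot(hi) + 1))

data = random.Random(0).sample(range(10 ** 6), 10_000)
f = LearnedRangeFilter(data)
print(f.may_contain(data[0], data[0]))  # True: no false negatives
```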
10:45 -- 11:15
Ellen Vitercik
Tree Search Configuration: Cutting Planes and Beyond
Cutting-plane methods have enabled remarkable successes in integer programming over the last few decades. State-of-the-art solvers integrate a myriad of cutting-plane techniques to speed up the underlying tree search algorithm used to find optimal solutions. In this talk, we provide the first sample complexity guarantees for learning high-performing cut-selection policies tailored to the instance distribution at hand, using samples. We then develop a general abstraction of tree search that captures key components such as node selection and variable selection. For this abstraction, we bound the sample complexity of learning a good policy for building the search tree. This talk is based on joint work with Nina Balcan, Siddharth Prasad, and Tuomas Sandholm.
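The empirical counterpart of such guarantees is simple (a toy sketch with our own stand-in for cut selection, not the paper's abstraction): draw instances from the distribution, run the tree search under each candidate parameter value, and keep the value with the smallest average tree size; the sample complexity bounds quantify how many instances make that choice trustworthy. Below, a parameter mu interpolates between two branching rules in a toy branch-and-bound for knapsack.

```python
import random

def tree_size(values, weights, cap, mu):
    """Branch and bound for 0/1 knapsack. `mu` blends two variable orders
    (value density vs. raw value), standing in for a tunable search policy.
    Returns the number of explored nodes."""
    order = sorted(range(len(values)), reverse=True,
                   key=lambda i: mu * values[i] / weights[i] + (1 - mu) * values[i])
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    n = len(v)
    suf_val = [0] * (n + 1)        # suffix value sums and max densities give an
    suf_dens = [0.0] * (n + 1)     # upper bound valid under any branching order
    for i in range(n - 1, -1, -1):
        suf_val[i] = suf_val[i + 1] + v[i]
        suf_dens[i] = max(suf_dens[i + 1], v[i] / w[i])

    best, nodes, stack = 0, 0, [(0, 0, cap)]
    while stack:
        i, val, c = stack.pop()
        nodes += 1
        best = max(best, val)
        if i == n or val + min(suf_val[i], c * suf_dens[i]) <= best:
            continue                        # leaf, or bound cannot beat best
        stack.append((i + 1, val, c))       # branch: skip item i
        if w[i] <= c:
            stack.append((i + 1, val + v[i], c - w[i]))  # branch: take item i
    return nodes

rng = random.Random(0)
def instance(n=16):
    vals = [rng.randint(1, 100) for _ in range(n)]
    wts = [rng.randint(1, 100) for _ in range(n)]
    return vals, wts, sum(wts) // 3

train = [instance() for _ in range(30)]     # empirical risk minimization over mu
best_mu = min([0.0, 0.25, 0.5, 0.75, 1.0],
              key=lambda mu: sum(tree_size(*I, mu) for I in train))
print(best_mu)
```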
11:20 -- 11:50
Isaac Grosof
Stochastic Scheduling with Predictions
We consider the problem of scheduling to minimize mean response time in a queueing system with Poisson arrivals and i.i.d. sizes (processing times), where predictions of the sizes are known to the scheduler. We consider the setting of bounded multiplicative prediction error. We evaluate a policy by its approximation error, the ratio between its mean response time and that of the optimal policy when sizes are exactly known. We present a simple, novel scheduling policy that achieves both consistency and graceful degradation. This is joint work with Ziv Scully and Michael Mitzenmacher, based on a paper in ITCS 2022.
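In the standard formalization of this setting (our notation, making the abstract's terms explicit), predictions have bounded multiplicative error and the benchmark is SRPT, which is optimal for mean response time when sizes are known exactly:

```latex
\begin{align*}
  &\text{Bounded multiplicative error: } \frac{s_i}{\beta} \le \hat{s}_i \le \beta\, s_i
    \quad \text{for every job } i,\\
  &\text{Approximation error of a policy } \pi: \quad
    \mathrm{err}(\pi) = \frac{\mathbb{E}[T_{\pi}]}{\mathbb{E}[T_{\mathrm{SRPT}}]},
    \quad T = \text{response time.}
\end{align*}
```

Consistency then means err(pi) tends to 1 as beta tends to 1, and graceful degradation means err(pi) grows slowly as the predictions worsen.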
Previous Workshops:
Workshop on Data-driven Algorithmics
by Andreas Krause, Pavlos Protopapas and Yaron Singer, Harvard, 2015.
[workshop webpage]
Workshop on Data-driven Algorithmics
by Andreas Krause and Yaron Singer, Bertinoro, 2017.
[workshop webpage]
Workshop on Automated Algorithm Design
by Nina Balcan, Bistra Dilkina, Carl Kingsford and Paul Medvedev, TTIC,
2019.
[workshop webpage]
Workshop on Learning-Based Algorithms
by Piotr Indyk, Yaron Singer, Ali Vakilian and Sergei Vassilvitskii,
TTIC, 2019.
[workshop webpage]
Workshop on Algorithms with Predictions
by Piotr Indyk, Yaron Singer, Ali Vakilian and Sergei Vassilvitskii, STOC
2020.
[workshop webpage]
Machine Learning for Algorithms (ML4A)
by Costis Daskalakis, Paul Hand, Piotr Indyk, Michael Mitzenmacher, Jelani
Nelson and Ronitt Rubinfeld, FODSI, 2021.
[workshop webpage]