This workshop aims to cover recent developments in the area of “algorithms with
predictions” (also known as learning-augmented algorithms or data-driven algorithms). These methods show how to
parameterize algorithms so that they can adapt their behavior to the properties of the input
distribution and consequently improve their performance, such as running time, space usage, or quality of the
solution.
Generally speaking, a result in this area takes a problem with strong computational lower bounds (for
instance on the competitive ratio), identifies a compact prediction that can be learned from real data,
and gives a proof tying the performance of the algorithm to the quality of the underlying prediction.
The field has blossomed, with applications to classical streaming algorithms, online scheduling,
clustering, filtering data structures, and many other problems. All of these methods deliver improved
performance when the predictions are good, while maintaining nearly identical worst-case guarantees when
they are not.
The workshop will cover recent advances in different domains, and introduce newcomers to open problems
in this area.
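To make this template concrete, here is a toy example in the spirit of the ski-rental algorithm of Purohit, Svitkina, and Kumar (NeurIPS 2018); the code and parameter names are illustrative, not drawn from any talk below. Renting skis costs 1 per day, buying costs b, a prediction yhat of the true number of ski days y is available, and a trust parameter lam trades consistency against robustness.

```python
import math

def ski_rental_with_prediction(y, yhat, b, lam):
    """Buy on day ceil(lam*b) if the prediction says buying pays off
    (yhat >= b); otherwise hedge and wait until day ceil(b/lam).
    This is (1+lam)-consistent (near-optimal when the prediction is
    right) and (1+1/lam)-robust (bounded harm when it is wrong)."""
    buy_day = math.ceil(lam * b) if yhat >= b else math.ceil(b / lam)
    if y >= buy_day:
        return (buy_day - 1) + b  # rent for buy_day-1 days, then buy
    return y                      # season ends first: rent every day

# Offline optimum is min(y, b) = 30 in both cases below.
print(ski_rental_with_prediction(y=100, yhat=95, b=30, lam=0.5))  # 44 <= 1.5 * 30
print(ski_rental_with_prediction(y=100, yhat=5,  b=30, lam=0.5))  # 89 <= 3 * 30
```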
When:
June 21, 23, and 24, 2022 (9:00am -- 12:00pm CET, Rome time)
Tuesday, June 21st
Yossi Azar
Flow Time Scheduling with Uncertain Processing Time
We consider the problem of online scheduling on a single machine to minimize unweighted and weighted flow time. Existing algorithms for these problems require exact knowledge of the processing time of each job. This assumption is crucial: even a slight perturbation of the processing times would lead to a polynomial competitive ratio. However, it very rarely holds in real-life scenarios. We present a competitive algorithm for unweighted flow time (with competitive ratio a function of the distortion) that does not require knowledge of the distortion in advance. For weighted flow time we present competitive algorithms as well, but in this case we need to know (an upper bound on) the distortion in advance. This is joint work with Stefano Leonardi and Noam Touitou, based on papers appearing in STOC 2021 and SODA 2022.
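To see the sensitivity the abstract alludes to, here is a toy preemptive simulator (our illustration, not the algorithm from the talk): it schedules by shortest predicted remaining processing time, where each predicted size may be off by a multiplicative distortion mu, and reports the total flow time.

```python
import random

def total_flow_time(jobs):
    """Preemptive single-machine schedule by shortest predicted remaining
    time. jobs: list of (release_time, true_size, predicted_size)."""
    jobs = sorted(jobs)
    t, i, flow = 0.0, 0, 0.0
    active = []  # entries: [predicted_remaining, release_time, true_remaining]
    while i < len(jobs) or active:
        if not active:
            t = max(t, jobs[i][0])          # machine idles until next arrival
        while i < len(jobs) and jobs[i][0] <= t:
            r, p, ph = jobs[i]
            active.append([ph, r, p])
            i += 1
        job = min(active)                   # least predicted remaining time
        run = job[2]                        # run until completion...
        if i < len(jobs):
            run = min(run, jobs[i][0] - t)  # ...or until the next arrival
        t += run
        job[0] -= run
        job[2] -= run
        if job[2] <= 1e-9:
            flow += t - job[1]              # flow time = completion - release
            active.remove(job)
    return flow

rng = random.Random(0)
mu = 2.0  # each predicted size lies in [size/mu, size*mu]
jobs = []
for _ in range(50):
    p = rng.uniform(1, 10)
    jobs.append((rng.uniform(0, 100), p, p * rng.uniform(1 / mu, mu)))
print(total_flow_time(jobs))                             # distorted predictions
print(total_flow_time([(r, p, p) for r, p, _ in jobs]))  # exact sizes (SRPT)
```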
10:45 -- 11:15
Benjamin Moseley
Machine Learning for Faster Optimization
This talk will discuss a model for augmenting algorithms with useful predictions to improve their running time. The model ensures that predictions are formally learnable and robust: learnability guarantees that predictions can be efficiently constructed from past data, and robustness guarantees that a prediction remains useful under modest changes to the problem input. This talk will discuss predictions that satisfy these properties and yield improved running times for matching algorithms.
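One concrete mechanism of this kind (a sketch under our own simplifications, not necessarily the construction from the talk): seed an augmenting-path matching algorithm with a predicted matching learned from past instances, so the residual work scales with how much of the prediction was wrong.

```python
def max_matching(graph, predicted=()):
    """Bipartite maximum matching via augmenting paths, warm-started from a
    predicted matching. graph: dict left_vertex -> list of right vertices;
    predicted: iterable of (left, right) pairs."""
    match_l, match_r = {}, {}
    for u, v in predicted:              # keep only valid, non-conflicting pairs
        if v in graph.get(u, ()) and u not in match_l and v not in match_r:
            match_l[u], match_r[v] = v, u

    def augment(u, seen):               # standard alternating-path search
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                if v not in match_r or augment(match_r[v], seen):
                    match_l[u], match_r[v] = v, u
                    return True
        return False

    for u in graph:                     # repair only what the prediction missed
        if u not in match_l:
            augment(u, set())
    return match_l

g = {1: ['a', 'b'], 2: ['a'], 3: ['b', 'c']}
print(max_matching(g, predicted=[(1, 'b'), (2, 'a')]))  # {1:'b', 2:'a', 3:'c'}
```

With a perfect prediction the repair loop does no augmenting work at all; the talk's concern is making such predictions formally learnable and robust.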
11:20 -- 11:50
Ravi Kumar
Parsimonious Learning-Augmented Algorithms
In this talk we will consider the number of predictions used by a learning-augmented algorithm as a resource measure. Focusing on two problems---online learning in the regret setting and online caching in the competitive-ratio setting---we show that a sublinear number of predictions suffices to achieve non-trivial performance gains.
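As a toy version of prediction-as-a-resource (our sketch, not the algorithms from the talk), consider a cache that spends a fixed prediction budget on misses to evict the page with the furthest predicted next request, and falls back to LRU once the budget runs out:

```python
from collections import OrderedDict

def count_misses(requests, k, next_use, budget):
    """Serve `requests` with a size-k cache. next_use(page, i) is a
    (learned) prediction of when `page` is next requested after time i.
    While the budget lasts, evict the page with the furthest predicted
    next use (Belady's rule on predictions); afterwards fall back to LRU."""
    cache = OrderedDict()                   # pages in least-recently-used order
    misses = 0
    for i, page in enumerate(requests):
        if page in cache:
            cache.move_to_end(page)         # refresh recency
            continue
        misses += 1
        if len(cache) >= k:
            if budget >= len(cache):        # one prediction per cached page
                budget -= len(cache)
                victim = max(cache, key=lambda p: next_use(p, i))
            else:
                victim = next(iter(cache))  # LRU victim
            del cache[victim]
        cache[page] = True                  # newest entry = most recently used
    return misses
```

The question the talk asks is how small the budget can be while still beating the prediction-free baseline.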
Thursday, June 23rd
09:00 -- 09:30
Debmalya Panigrahi
Online Algorithms with Multiple Advice
The bulk of the literature on online algorithms with ML advice focuses on a single (possibly noisy) predictor. But, in reality, multiple ML models are often used to make future predictions for the same application, and their predictions/advice can differ from one another. In this case, how should an online algorithm choose among these different suggestions? This talk will focus on models, problems, and algorithms to address this question. The talk will include some recent results with Keerti Anand, Rong Ge, and Amit Kumar, and briefly touch upon older results with Sreenivas Gollapudi.
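The generic starting point for aggregating advice (a sketch of the classic experts tool, not the talk's results) is multiplicative weights: follow each predictor with probability proportional to its past accuracy, which guarantees expected loss close to that of the best single predictor in hindsight.

```python
import random

def follow_predictors(losses, eta=0.1, seed=0):
    """losses[t][j] in [0,1] is predictor j's loss at step t. Pick one
    predictor per step with probability proportional to its weight, then
    decay weights exponentially in the observed losses. With a suitably
    tuned eta, expected regret against the best fixed predictor is
    O(sqrt(T log k))."""
    rng = random.Random(seed)
    k = len(losses[0])
    weights = [1.0] * k
    total = 0.0
    for step in losses:
        j = rng.choices(range(k), weights=weights)[0]
        total += step[j]
        for a in range(k):
            weights[a] *= (1 - eta) ** step[a]
    return total
```

Following full online algorithms rather than point predictions is harder, since switching between them can itself incur cost; that gap is part of what the talk addresses.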
09:35 -- 10:05
Ola Svensson
Learning-Augmented Online Algorithms and the Primal-Dual Method
The design of learning-augmented online algorithms is a new and active research area. The goal is to understand how best to incorporate predictions of the future, provided e.g. by machine learning algorithms, that rarely come with guarantees on their accuracy. In the absence of such guarantees, the difficulty is to find a good balance: on the one hand, blindly following the prediction might lead to a very bad solution if the prediction is misleading; on the other hand, an algorithm that does not trust the prediction at all will never benefit from an excellent one. An explosion of recent results addresses this issue by designing smart algorithms that exploit the problem structure to achieve a good trade-off between these two cases. In this talk, we will discuss this emerging line of work. In particular, we will show how to unify and generalize some of these results by extending the powerful primal-dual method for online algorithms to the learning-augmented setting. This is joint work with Etienne Bamas and Andreas Maggiori.
10:45 -- 11:15
Ronitt Rubinfeld
Oracles in Distribution Testing
Distribution testing problems often require a prohibitively large number of samples. However, when useful oracles answering various questions about the distributions are available, dramatically fewer samples are needed. In this talk, I will survey such results, including scenarios in which approximate oracles are learned from the data.
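One concrete example of the phenomenon (our sketch, not necessarily one from the talk): testing uniformity over a domain of size n needs roughly sqrt(n) samples in the standard model, but with an evaluation oracle for p the sample count is independent of n, because E_{x~p}[p(x)] = ||p||_2^2 equals 1/n exactly when p is uniform and is at least (1 + 4*eps^2)/n when p is eps-far from uniform in total variation.

```python
import random

def is_uniform(sample, evaluate, n, eps, m=20000):
    """sample() draws x ~ p; evaluate(x) returns p(x) via the oracle.
    Estimates E_{x~p}[p(x)] = sum_x p(x)^2 and thresholds it halfway
    across the gap between the uniform case (1/n) and the eps-far case
    (>= (1 + 4*eps^2)/n). The sample count m does not depend on n."""
    estimate = sum(evaluate(sample()) for _ in range(m)) / m
    return estimate < (1 + 2 * eps ** 2) / n

# A skewed distribution: the first half of the domain carries 75% of the
# mass, so the TV distance from uniform is 0.25.
n = 1000
p = [1.5 / n] * (n // 2) + [0.5 / n] * (n // 2)
rng = random.Random(0)
draw = lambda: rng.choices(range(n), weights=p)[0]
print(is_uniform(draw, lambda x: p[x], n, eps=0.25))  # False: far from uniform
```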
11:20 -- 11:50
Panel
Moderated by Michael Mitzenmacher
Friday, June 24th
09:00 -- 09:30
Stefano Leonardi
Prophet Inequalities and Mechanism Design with a Single Sample
We present recent results on prophet inequalities for matching markets and on two-sided mechanism design that provide near-optimal approximations when only limited information is available, in the form of a single sample from each prior distribution.
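In the simplest single-item case this has a remarkably clean form (our illustration, following Rubinstein, Wang, and Weinberg, 2020): set the threshold to the maximum of one sample drawn from each value distribution, accept the first arriving value that meets it, and you are guaranteed half of the prophet's expected maximum.

```python
import random

rng = random.Random(1)
dists = [lambda s=s: rng.random() * s for s in (1.0, 2.0, 5.0)]  # Uniform[0, s]

def single_sample_threshold(dists):
    """One sample per distribution sets the threshold; then accept the
    first real value that meets it (0 if none does). Guarantees at least
    half the prophet's expected maximum."""
    threshold = max(d() for d in dists)
    for d in dists:
        v = d()
        if v >= threshold:
            return v
    return 0.0

trials = 100_000
alg = sum(single_sample_threshold(dists) for _ in range(trials)) / trials
prophet = sum(max(d() for d in dists) for _ in range(trials)) / trials
print(alg / prophet)  # empirically above the guaranteed 0.5
```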
09:35 -- 10:05
Kapil Vaidya
SNARF: A Learning-Enhanced Range Filter
We present the Sparse Numerical Array-Based Range Filter (SNARF), a learned range filter that efficiently supports range queries for numerical data. SNARF creates a model of the data distribution to map the keys into a bit array, which is stored in a compressed form. The model, together with the compressed bit array, constitutes SNARF and is used to answer membership queries. We evaluate SNARF on multiple synthetic and real-world datasets, both as a stand-alone filter and by integrating it into RocksDB. For range queries, SNARF provides up to a 50x better false positive rate than state-of-the-art range filters, such as SuRF and Rosetta, at the same space usage. We also evaluate SNARF in RocksDB as a filter replacement for filtering requests before they access on-disk data structures, where it improves the execution time of the system by up to 10x compared to SuRF and Rosetta on certain read-only workloads.
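A stripped-down rendition of this design (a sketch under our own modeling choices, not the actual SNARF code): fit a monotone piecewise-linear model of the key distribution, map each key through the model into a slot of a bit array, and answer a range query by probing the slots the model assigns to the interval. Monotonicity guarantees no false negatives; false positives come from model error and slot collisions.

```python
import bisect, random

class LearnedRangeFilter:
    """Toy learned range filter in the spirit of SNARF. A piecewise-linear
    model of the empirical CDF maps each key to a slot in a bit array; a
    monotone model means a key in [lo, hi] always lands between slot(lo)
    and slot(hi), so range queries never produce false negatives."""

    def __init__(self, keys, bits_per_key=8, segments=64):
        keys = sorted(keys)
        self.m = bits_per_key * len(keys)
        # Model: the CDF sampled at `segments` quantiles, linearly interpolated.
        idx = [round(j * (len(keys) - 1) / segments) for j in range(segments + 1)]
        self.xs = [keys[j] for j in idx]
        self.bits = bytearray((self.m + 7) // 8)  # stand-in for the compressed array
        for k in keys:
            self._set(self._slot(k))

    def _slot(self, key):
        # Monotone position estimate in [0, m-1] from the piecewise-linear CDF.
        j = bisect.bisect_right(self.xs, key)
        if j == 0:
            cdf = 0.0
        elif j == len(self.xs):
            cdf = 1.0
        else:
            x0, x1 = self.xs[j - 1], self.xs[j]
            frac = (key - x0) / (x1 - x0) if x1 > x0 else 0.0
            cdf = (j - 1 + frac) / (len(self.xs) - 1)
        return min(self.m - 1, int(cdf * self.m))

    def _set(self, i):
        self.bits[i // 8] |= 1 << (i % 8)

    def may_contain(self, lo, hi):
        # Probe every slot the model assigns to the query interval [lo, hi].
        return any(self.bits[i // 8] >> (i % 8) & 1
                   for i in range(self._slot(lo), self._slot(hi) + 1))

data = random.Random(0).sample(range(10 ** 6), 10_000)
f = LearnedRangeFilter(data)
print(f.may_contain(data[0], data[0]))  # True: no false negatives
```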
10:45 -- 11:15
Ellen Vitercik
Tree Search Configuration: Cutting Planes and Beyond
Cutting-plane methods have enabled remarkable successes in integer programming over the last few decades. State-of-the-art solvers integrate a myriad of cutting-plane techniques to speed up the underlying tree search algorithm used to find optimal solutions. In this talk, we provide the first sample complexity guarantees for learning high-performing cut-selection policies tailored to the instance distribution at hand, using samples. We then develop a general abstraction of tree search that captures key components such as node selection and variable selection. For this abstraction, we bound the sample complexity of learning a good policy for building the search tree. This talk is based on joint work with Nina Balcan, Siddharth Prasad, and Tuomas Sandholm.
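The empirical counterpart of such guarantees is simple (a toy sketch with our own stand-in for cut selection, not the paper's abstraction): draw instances from the distribution, run the tree search under each candidate parameter value, and keep the value with the smallest average tree size; the sample complexity bounds quantify how many instances make that choice trustworthy. Below, a parameter mu interpolates between two branching rules in a toy branch-and-bound for knapsack.

```python
import random

def tree_size(values, weights, cap, mu):
    """Branch and bound for 0/1 knapsack. `mu` blends two variable orders
    (value density vs. raw value), standing in for a tunable search policy.
    Returns the number of explored nodes."""
    order = sorted(range(len(values)), reverse=True,
                   key=lambda i: mu * values[i] / weights[i] + (1 - mu) * values[i])
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    n = len(v)
    suf_val = [0] * (n + 1)        # suffix value sums and max densities give an
    suf_dens = [0.0] * (n + 1)     # upper bound valid under any branching order
    for i in range(n - 1, -1, -1):
        suf_val[i] = suf_val[i + 1] + v[i]
        suf_dens[i] = max(suf_dens[i + 1], v[i] / w[i])

    best, nodes, stack = 0, 0, [(0, 0, cap)]
    while stack:
        i, val, c = stack.pop()
        nodes += 1
        best = max(best, val)
        if i == n or val + min(suf_val[i], c * suf_dens[i]) <= best:
            continue                        # leaf, or bound cannot beat best
        stack.append((i + 1, val, c))       # branch: skip item i
        if w[i] <= c:
            stack.append((i + 1, val + v[i], c - w[i]))  # branch: take item i
    return nodes

rng = random.Random(0)
def instance(n=16):
    vals = [rng.randint(1, 100) for _ in range(n)]
    wts = [rng.randint(1, 100) for _ in range(n)]
    return vals, wts, sum(wts) // 3

train = [instance() for _ in range(30)]     # empirical risk minimization over mu
best_mu = min([0.0, 0.25, 0.5, 0.75, 1.0],
              key=lambda mu: sum(tree_size(*I, mu) for I in train))
print(best_mu)
```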
11:20 -- 11:50
Isaac Grosof
Stochastic Scheduling with Predictions
We consider the problem of scheduling to minimize mean response time in a queueing system with Poisson arrivals and i.i.d. sizes (processing times), where predictions of the sizes are known to the scheduler. We consider the setting of bounded multiplicative prediction error. We evaluate a policy by its approximation error, the ratio between its mean response time and that of the optimal policy when sizes are exactly known. We present a simple, novel scheduling policy that achieves both consistency and graceful degradation. This is joint work with Ziv Scully and Michael Mitzenmacher, based on a paper in ITCS 2022.
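In the standard formalization of this setting (our notation, making the abstract's terms explicit), predictions have bounded multiplicative error and the benchmark is SRPT, which is optimal for mean response time when sizes are known exactly:

```latex
\begin{align*}
  &\text{Bounded multiplicative error: } \frac{s_i}{\beta} \le \hat{s}_i \le \beta\, s_i
    \quad \text{for every job } i,\\
  &\text{Approximation error of a policy } \pi: \quad
    \mathrm{err}(\pi) = \frac{\mathbb{E}[T_{\pi}]}{\mathbb{E}[T_{\mathrm{SRPT}}]},
    \quad T = \text{response time.}
\end{align*}
```

Consistency then means err(pi) tends to 1 as beta tends to 1, and graceful degradation means err(pi) grows slowly as the predictions worsen.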
Previous Workshops:
Workshop on Data-driven Algorithmics
by Andreas Krause, Pavlos Protopapas and Yaron Singer, Harvard, 2015.
[workshop webpage]
Workshop on Data-driven Algorithmics
by Andreas Krause and Yaron Singer, Bertinoro, 2017.
[workshop webpage]
Workshop on Automated Algorithm Design
by Nina Balcan, Bistra Dilkina, Carl Kingsford and Paul Medvedev, TTIC,
2019.
[workshop webpage]
Workshop on Learning-Based Algorithms
by Piotr Indyk, Yaron Singer, Ali Vakilian and Sergei Vassilvitskii,
TTIC, 2019.
[workshop webpage]
Workshop on Algorithms with Predictions
by Piotr Indyk, Yaron Singer, Ali Vakilian and Sergei Vassilvitskii, STOC
2020.
[workshop webpage]
Machine Learning for Algorithms (ML4A)
by Costis Daskalakis, Paul Hand, Piotr Indyk, Michael Mitzenmacher, Jelani
Nelson and Ronitt Rubinfeld, FODSI, 2021.
[workshop webpage]