On June 30, 2025, at ENS de Lyon (amphi A), as a satellite event of the COLT 2025 conference on learning theory, the SHARP and Foundry projects of the PEPR IA are co-organizing a one-day workshop on “Frugal and Robust Foundations for Machine Learning – Occam’s razor in the age of LLMs”. The workshop will be dedicated to recent advances in frugal and robust learning.
The program mixes invited lectures and contributed presentations.
Starting time moved to 9:30am
Confirmed invited speakers:
- Frederik Mallmann-Trenn (King’s College London)
- Frederik is a Senior Lecturer (Associate Professor) at King’s College London, where he leads the Algorithms and Data Analysis group and directs the Random Lab. His research focuses on sparsification of neural networks, stochastic processes, and biological distributed computing.
- Title: The Strong Lottery Ticket Hypothesis and the Random Subset Sum Problem
- Abstract: The Strong Lottery Ticket Hypothesis (SLTH) challenges the necessity of weight updates in training neural networks. It posits that sufficiently large random neural networks already contain sparse subnetworks capable of approximating any target network of smaller size, without requiring any further weight training. In this talk, I will explore the SLTH as a theoretical framework that reimagines the role of initialization in deep learning. Building on this foundation, I will connect the SLTH to the Random Subset Sum Problem, a well-studied NP-hard problem in combinatorial optimization. By framing the search for these high-performing sparse subnetworks as a variant of the Random Subset Sum Problem, we can leverage insights from theoretical computer science to understand the likelihood of such subnetworks existing. (A toy numerical illustration of the random subset sum phenomenon follows the speaker list.)
- Julia Gusak (Inria Bordeaux)
- Julia is a Research Scientist at Inria Bordeaux, specializing in efficient deep learning. Her current research focuses on scaling training under memory constraints using accuracy-preserving strategies such as re-materialization and parallelism. She also works with approximation techniques, including low-rank methods and quantization, with applications in both training and inference. Her broader interests include robustness and generalization of deep models.
- Title: Training Neural Networks Under Memory Constraints
- Abstract: The talk will focus on methods for training neural networks under memory constraints. We begin with a high-level overview of where memory and compute limitations arise during training, and common techniques used in practice to work within these constraints. We then present two recent approaches that improve on existing techniques. The first targets single-device training and is based on re-materialization, where intermediate values are recomputed during backpropagation instead of being stored during the forward pass. We introduce a hierarchical strategy that partitions large computation graphs and solves subgraphs independently to generate a global execution plan under memory constraints. This allows the method to handle more complex architectures with lower runtime overhead. The second approach addresses distributed training using pipeline parallelism. It integrates re-materialization into a memory-aware scheduling framework that operates across microbatches and devices. This enables fine-grained control over memory usage and allows execution plans to adapt to device-specific constraints. While prior work has explored local and heuristic strategies in pipeline settings, this approach computes memory-efficient schedules for the entire pipeline, enabling training of deeper models and longer sequences on fixed hardware. Together, these methods make it possible to train larger and more expressive models within given hardware memory limits, without requiring changes to model architecture or training objectives. (A minimal sketch of plain re-materialization follows the speaker list.)
- Patrick Loiseau (Inria Saclay, École Polytechnique, ENSAE)
- Patrick is a Research Director at Inria and part-time Professor of Computer Science at École Polytechnique and ENSAE. He co-leads the FairPlay team, a joint initiative between Inria, Criteo, and ENSAE, focusing on fairness, explainability, and responsible AI. His research lies at the intersection of game theory and statistical learning, with applications in security, privacy, and ethics of online systems.
- Title: DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation
- Abstract: We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain, to some relevant pre-defined utility of a machine learning task, of aggregating an individual dataset to others. The Shapley value is a natural tool to perform dataset valuation due to its formal axiomatic justification, which can be combined with Monte Carlo integration to overcome the computational tractability challenges. Such generic approximation methods, however, remain expensive in some cases. In this paper, we exploit the knowledge about the structure of the dataset valuation problem to devise more efficient Shapley value estimators. We propose a novel approximation, referred to as discrete uniform Shapley, which is expressed as an expectation under a discrete uniform distribution with support of reasonable size. We justify the relevancy of the proposed framework via asymptotic and non-asymptotic theoretical guarantees and illustrate its benefits via an extensive set of numerical experiments. (A sketch of the generic Monte Carlo baseline follows the speaker list.)
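As a toy illustration of the random subset sum phenomenon underlying Talk 1 (and not the construction presented in the talk), the sketch below brute-forces, for a handful of random values, the subset whose sum is closest to a random target; the approximation error typically shrinks roughly exponentially with the number of values. All names and parameters are illustrative.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def best_subset_sum_error(values, target):
    """Smallest |sum(S) - target| over all subsets S of `values` (brute force)."""
    best = abs(target)  # the empty subset
    for r in range(1, len(values) + 1):
        for subset in itertools.combinations(values, r):
            best = min(best, abs(sum(subset) - target))
    return best

# With n i.i.d. uniform(-1, 1) values, some subset typically sums within
# roughly 2^-n of a target in [-1, 1]; this is the phenomenon used to argue
# that a large random network contains a subnetwork matching a target weight.
for n in (8, 12, 16):
    values = rng.uniform(-1.0, 1.0, size=n)
    target = rng.uniform(-1.0, 1.0)
    print(f"n={n:2d}  best error = {best_subset_sum_error(values, target):.2e}")
```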
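Talk 2 builds on re-materialization; as background only (not the hierarchical or pipeline-aware methods of the talk), here is a minimal PyTorch sketch of the underlying trade-off: activation checkpointing stores only segment boundaries during the forward pass and recomputes the activations inside each segment during backpropagation. The model, sizes, and number of segments are illustrative.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# A deep MLP whose intermediate activations dominate training memory.
depth, width = 16, 512
model = nn.Sequential(*[nn.Sequential(nn.Linear(width, width), nn.ReLU())
                        for _ in range(depth)])
x = torch.randn(32, width, requires_grad=True)

# Standard training step: every intermediate activation is kept for backprop.
model(x).sum().backward()

# Re-materialized step: the network is cut into 4 segments; only the segment
# boundaries are stored, and the inner activations are recomputed during the
# backward pass, trading extra compute for lower peak memory.
model.zero_grad()
checkpoint_sequential(model, 4, x).sum().backward()
```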
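For Talk 3, the DU-Shapley proxy itself is not reproduced here; the sketch below is the generic Monte Carlo (random permutation) Shapley estimator that such proxies aim to outperform, written for an arbitrary coalition utility. The utility used here is a placeholder standing in for, e.g., validation accuracy of a model trained on the aggregated datasets.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_shapley(n_datasets, utility, n_permutations=2000):
    """Monte Carlo Shapley values: average marginal contribution of each
    dataset over random orderings of the players."""
    phi = np.zeros(n_datasets)
    for _ in range(n_permutations):
        coalition, prev = [], utility([])
        for i in rng.permutation(n_datasets):
            coalition.append(i)
            value = utility(coalition)
            phi[i] += value - prev
            prev = value
    return phi / n_permutations

# Placeholder utility: diminishing returns in the number of aggregated samples.
sizes = np.array([100, 400, 1000, 50])
utility = lambda coalition: np.log1p(sizes[list(coalition)].sum())
print(mc_shapley(len(sizes), utility))
```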
Location: Amphi A, École Normale Supérieure de Lyon, Site Monod, 46 allée d’Italie, Lyon. No registration is needed.
Schedule
9h30 Opening
9h35 INVITED TALK 1: Frederik Mallmann-Trenn (King’s College London) • The Strong Lottery Ticket Hypothesis and the Random Subset Sum Problem
10h30 Mathurin Massias (Inria Lyon & ENS de Lyon) • Expectations vs reality – on the role of stochasticity in generalization of flow matching
A growing body of research aims to understand why recent generative models – such as diffusion and flow matching – generalize so effectively. Among the proposed explanations are the inductive biases of deep learning architectures and the stochastic nature of the conditional flow matching loss. In this work, we rule out the latter – the noisy nature of the loss – as a primary contributor to generalization in flow matching. First, we empirically show that in high-dimensional settings, the stochastic and closed-form versions of the flow matching loss yield nearly equivalent losses. Then, we demonstrate that both variants achieve comparable statistical performance, with the surprising observation that using the closed form can even improve performance. (An illustrative sketch of the two loss variants follows the schedule.)
Authors: Quentin Bertrand (Inria), Anne Gagneux (ENS de Lyon), Mathurin Massias (Inria), Rémi Emonet (Université Jean Monnet)
11h00 Coffee break
11h30 INVITED TALK 2: Julia Gusak (Inria Bordeaux) • Training Neural Networks Under Memory Constraints
12h30 Lunch break
14h00 INVITED TALK 3: Patrick Loiseau (Inria Saclay, École Polytechnique, ENSAE) • DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation
14h50 Pierre-Louis Cauvin (Université Grenoble Alpes) • The Impact of Uncertainty on Regularized Learning in Games
We investigate how randomness affects learning in games by examining a perturbed variant of the continuous-time FTRL dynamics. Our findings reveal that “uncertainty favors extremes”: every player’s choices approach pure strategies in finite time. Moreover, we show that (a) the only possible limits of the perturbed dynamics are pure Nash equilibria; and (b) a span of pure strategies is stable and attracting if and only if it is closed under better replies. Finally, we prove that in games where the deterministic dynamics are recurrent, random shocks cause trajectories to drift toward the boundary on average. (A toy simulation illustrating this effect follows the schedule.)
Authors: Pierre-Louis Cauvin, Davide Legacci, Panayotis Mertikopoulos (Univ. Grenoble Alpes, LIG)
Note: This work has been accepted at ICML 2025 (poster)
15h20 Charlotte Laclau (Télécom Paris, Institut Polytechnique de Paris) • Seeking for long-term fairness
15h50 Closing
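Illustrative sketch for the 10h30 talk: under one common parameterization (linear interpolation paths between a standard Gaussian source and an empirical data distribution, which may differ from the paper’s exact setting), the stochastic conditional flow matching target x1 - x0 and its closed-form conditional expectation can be compared directly. The snippet below computes both targets; all names are illustrative.

```python
import torch

def stochastic_target(x0, x1, t):
    """Per-sample (stochastic) conditional flow matching target along linear paths."""
    xt = (1 - t) * x0 + t * x1
    return xt, x1 - x0

def closed_form_target(xt, t, data):
    """Exact velocity E[x1 - x0 | xt] for an empirical data distribution and a
    standard Gaussian source, via posterior (softmax) weights over data points."""
    logw = -torch.cdist(xt, t * data) ** 2 / (2 * (1 - t) ** 2)
    w = torch.softmax(logw, dim=1)              # p(data point j | xt)
    x1_hat = w @ data                           # E[x1 | xt]
    x0_hat = (xt - t * x1_hat) / (1 - t)        # implied E[x0 | xt]
    return x1_hat - x0_hat

torch.manual_seed(0)
data = torch.randn(256, 2) + torch.tensor([4.0, 0.0])   # toy 2D dataset
x0 = torch.randn(512, 2)
x1 = data[torch.randint(0, len(data), (512,))]
t = 0.3
xt, stoch = stochastic_target(x0, x1, t)
exact = closed_form_target(xt, t, data)
# Both targets share the same conditional mean, so the batch average of their
# difference is near zero, even though the stochastic target is much noisier.
print((stoch - exact).mean(dim=0), stoch.var(dim=0), exact.var(dim=0))
```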
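Illustrative sketch for the 14h50 talk: a crude Euler–Maruyama simulation of exponential-weights (entropic FTRL) dynamics driven by noisy payoffs in a 2x2 coordination game. It is meant only to show the “uncertainty favors extremes” effect in the simplest possible case, not the paper’s general setting; all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.eye(2)          # 2x2 coordination game: both players get 1 on the diagonal

def softmax(y):
    z = np.exp(y - y.max())
    return z / z.sum()

def simulate(sigma=0.5, dt=0.01, T=50.0):
    """Euler-Maruyama discretization of exponential-weights (entropic FTRL)
    dynamics in which each player's payoff observations are perturbed by noise."""
    y1 = np.zeros(2)   # player 1's cumulative (perturbed) payoff scores
    y2 = np.zeros(2)   # player 2's cumulative (perturbed) payoff scores
    for _ in range(int(T / dt)):
        x1, x2 = softmax(y1), softmax(y2)
        y1 += A @ x2 * dt + sigma * np.sqrt(dt) * rng.normal(size=2)
        y2 += A.T @ x1 * dt + sigma * np.sqrt(dt) * rng.normal(size=2)
    return softmax(y1), softmax(y2)

x1, x2 = simulate()
print(x1, x2)   # typically concentrated on a vertex, i.e. a pure Nash equilibrium
```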