

{"id":149,"date":"2025-06-06T13:27:20","date_gmt":"2025-06-06T11:27:20","guid":{"rendered":"https:\/\/project.inria.fr\/sharp\/?p=149"},"modified":"2026-03-09T15:05:21","modified_gmt":"2026-03-09T14:05:21","slug":"sharpfoundry-colt25","status":"publish","type":"post","link":"https:\/\/project.inria.fr\/sharp\/sharpfoundry-colt25\/","title":{"rendered":"SHARP+Foundry workshop @COLT 2025 in Lyon, June 30th 2025"},"content":{"rendered":"<p><strong>On June 30th 2025<\/strong>, <strong>at ENS de Lyon<\/strong> <strong>(amphi A)<\/strong>, as a Satellite event of the <a href=\"https:\/\/learningtheory.org\/colt2025\/\">COLT2025 conference<\/a> on learning theory , the <a href=\"https:\/\/project.inria.fr\/sharp\/\" data-type=\"page\" data-id=\"4\">SHARP<\/a> and <a href=\"https:\/\/www.pepr-ia.fr\/projet\/foundry\/\">Foundry<\/a> projects of the <a href=\"https:\/\/www.pepr-ia.fr\/projet\/foundry\/\">PEPR IA<\/a> are co-organizing a one-day workshop on &#8220;Frugal and Robust Foundations for Machine Learning \u2013 Occam&#8217;s razor at the age of LLM&#8217;s&#8221;. The workshop will be dedicated to recent advances in frugal and robust learning. <\/p>\n\n\n\n<p>The program mixes invited lectures and contributed presentations.<\/p>\n\n\n\n<p><strong>Starting time moved to 9:30am<\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Confirmed invited speakers :<\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li><a href=\"https:\/\/sites.google.com\/view\/mallmann\/\" data-type=\"URL\" data-id=\"https:\/\/sites.google.com\/view\/mallmann\/\">Frederik Mallmann-Trenn<\/a> (King&#8217;s College London)<ul><li>Frederik is a Senior Lecturer (Associate Professor) at King\u2019s College London, where he leads the Algorithms and Data Analysis group and directs the Random Lab. His research focuses on&nbsp;<strong>sparsification of neural networks<\/strong>,&nbsp;<strong>stochastic processes<\/strong>, and&nbsp;<strong>biological distributed computing<\/strong>.<\/li><li><strong>Title:<\/strong>&nbsp;The Strong Lottery Ticket Hypothesis and the Random Subset Sum Problem <\/li><li><strong>Abstract:<\/strong> The Strong Lottery Ticket Hypothesis (SLTH) challenges the necessity of weight updates in training neural networks. It posits that sufficiently large random neural networks already contain sparse subnetworks capable of approximating any target network of smaller size\u2014without requiring any further weight training. In this talk, I will explore the SLTH as a theoretical framework that reimagines the role of initialization in deep learning. Building on this foundation, I will connect the SLTH to the Random Subset Sum Problem, a well-studied NP-hard problem in combinatorial optimization. By framing the search for these high-performing sparse subnetworks as a variant of the Random Subset Sum Problem, we can leverage insights from theoretical computer science to understand the likelihood of such subnetworks existing.<\/li><\/ul><\/li><\/ul>\n\n\n\n<ul class=\"wp-block-list\"><li><a href=\"https:\/\/juliagusak.github.io\/about\/\" data-type=\"URL\" data-id=\"https:\/\/juliagusak.github.io\/about\/\">Julia Gusak<\/a> (Inria Bordeaux)<ul><li>Julia is a Research Scientist at Inria Bordeaux, specializing in efficient deep learning. Her current research focuses on s<strong>caling training under memory constraints using accuracy-preserving strategies<\/strong> such as re-materialization and parallelism. 
- [Julia Gusak](https://juliagusak.github.io/about/) (Inria Bordeaux)
  - Julia is a Research Scientist at Inria Bordeaux, specializing in efficient deep learning. Her current research focuses on **scaling training under memory constraints using accuracy-preserving strategies** such as re-materialization and parallelism. She also works with approximation techniques, including **low-rank methods and quantization**, with applications in both training and inference. Her broader interests include **robustness and generalization of deep models**.
  - **Title:** Training Neural Networks Under Memory Constraints
  - **Abstract:** The talk will focus on methods for training neural networks under memory constraints. We begin with a high-level overview of where memory and compute limitations arise during training, and of common techniques used in practice to work within these constraints. We then present two recent approaches that improve on existing techniques. The first targets single-device training and is based on re-materialization, where intermediate values are recomputed during backpropagation instead of being stored during the forward pass. We introduce a hierarchical strategy that partitions large computation graphs and solves subgraphs independently to generate a global execution plan under memory constraints. This allows the method to handle more complex architectures with lower runtime overhead. The second approach addresses distributed training using pipeline parallelism. It integrates re-materialization into a memory-aware scheduling framework that operates across microbatches and devices. This enables fine-grained control over memory usage and allows execution plans to adapt to device-specific constraints. While prior work has explored local and heuristic strategies in pipeline settings, this approach computes memory-efficient schedules for the entire pipeline, enabling training of deeper models and longer sequences on fixed hardware. Together, these methods make it possible to train larger and more expressive models within given hardware memory limits, without requiring changes to model architecture or training objectives. *(A minimal re-materialization example follows this list.)*
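The basic building block here, re-materialization, is available off the shelf: PyTorch's `torch.utils.checkpoint` drops a block's intermediate activations in the forward pass and recomputes them during backward. The sketch below uses toy residual blocks and uniform per-block checkpointing; the hierarchical and pipeline-aware planners discussed in the talk decide far more carefully which subgraphs to store and which to recompute.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    """Toy residual block standing in for one segment of the computation graph."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.net(x)

class CheckpointedNet(nn.Module):
    def __init__(self, dim=256, depth=12):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for block in self.blocks:
            # Activations inside `block` are not stored; they are recomputed
            # during backward, trading one extra forward pass per block for a
            # much lower peak activation memory.
            x = checkpoint(block, x, use_reentrant=False)
        return x

x = torch.randn(32, 256, requires_grad=True)
loss = CheckpointedNet()(x).pow(2).mean()
loss.backward()   # gradients match the non-checkpointed computation
```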
- [Patrick Loiseau](https://patrickloiseau.github.io/) (Inria Saclay, École Polytechnique, ENSAE)
  - Patrick is a Research Director at Inria and a part-time Professor of Computer Science at École Polytechnique and ENSAE. He co-leads the **FairPlay** team, a joint initiative between Inria, Criteo, and ENSAE, focusing on **fairness, explainability, and responsible AI**. His research lies at the intersection of **game theory** and **statistical learning**, with applications in **security**, **privacy**, and **ethics of online systems**.
  - **Title:** DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation
  - **Abstract:** We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain, to some relevant pre-defined utility of a machine learning task, of aggregating an individual dataset to others. The Shapley value is a natural tool to perform dataset valuation due to its formal axiomatic justification, and it can be combined with Monte Carlo integration to overcome the computational tractability challenges. Such generic approximation methods, however, remain expensive in some cases. In this paper, we exploit the knowledge about the structure of the dataset valuation problem to devise more efficient Shapley value estimators. We propose a novel approximation, referred to as discrete uniform Shapley, which is expressed as an expectation under a discrete uniform distribution with support of reasonable size. We justify the relevance of the proposed framework via asymptotic and non-asymptotic theoretical guarantees and illustrate its benefits via an extensive set of numerical experiments. *(A baseline Monte Carlo estimator is sketched below.)*
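For context, the generic baseline that such structured proxies compete with is permutation-sampling Monte Carlo. The sketch below is a minimal version of that baseline with a made-up utility function and dataset names; DU-Shapley itself instead takes an expectation under a discrete uniform distribution over coalition sizes, which is far cheaper when each utility evaluation means training a model.

```python
import random

def shapley_permutation_mc(utility, players, n_perms=2000, seed=0):
    """Generic Monte Carlo Shapley estimate from random permutations."""
    rng = random.Random(seed)
    values = dict.fromkeys(players, 0.0)
    for _ in range(n_perms):
        order = rng.sample(players, len(players))   # one random permutation
        coalition, prev = frozenset(), utility(frozenset())
        for p in order:
            coalition = coalition | {p}
            cur = utility(coalition)
            values[p] += cur - prev                 # marginal contribution of p
            prev = cur
    return {p: v / n_perms for p, v in values.items()}

# Hypothetical utility: task performance grows with total data size, with
# diminishing returns (names and sizes are made-up placeholders).
sizes = {"hospital_A": 1000, "hospital_B": 400, "hospital_C": 100}
utility = lambda coalition: sum(sizes[p] for p in coalition) ** 0.5
print(shapley_permutation_mc(utility, list(sizes)))
```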
**Location: Amphi A, École Normale Supérieure de Lyon, Site Monod, 46 allée d'Italie.** No registration is needed.

## Schedule

~~9h00~~ 9:30 Opening

**~~9h05~~ 9:35 INVITED TALK 1: Frederik Mallmann-Trenn (King's College London)** • [The Strong Lottery Ticket Hypothesis and the Random Subset Sum Problem](https://project.inria.fr/sharp/files/2025/07/COLT.pdf)

**~~10h00~~ 10:30 Mathurin Massias (Inria Lyon & ENS de Lyon)** • [Expectations vs reality – on the role of stochasticity in generalization of flow matching](https://project.inria.fr/sharp/files/2025/07/sharpfoundry_MASSIAS.pdf)

*A growing body of research aims to understand why recent generative models – such as diffusion and flow matching – generalize so effectively. Among the proposed explanations are the inductive biases of deep learning architectures and the stochastic nature of the conditional flow matching loss. In this work, we rule out the latter – the noisy nature of the loss – as a primary contributor to generalization in flow matching. First, we empirically show that in high-dimensional settings, the stochastic and closed-form versions of the flow matching loss yield nearly equivalent values. Then, we demonstrate that both variants achieve comparable statistical performance, with the surprising observation that using the closed form can even improve performance.*

Authors: Quentin Bertrand (Inria), Anne Gagneux (ENS de Lyon), Mathurin Massias (Inria), Rémi Emonet (Université Jean Monnet)

~~10h30~~ 11:00 Coffee break

**~~11h00~~ 11:30 INVITED TALK 2: Julia Gusak (Inria Bordeaux)** • Training Neural Networks Under Memory Constraints

~~12h00~~ 12:30 Lunch break

**~~13h30~~ 14:00 INVITED TALK 3: Patrick Loiseau (Inria Saclay, École Polytechnique, ENSAE)** • DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation

**14:50 Pierre-Louis Cauvin (Université Grenoble Alpes)** • [The Impact of Uncertainty on Regularized Learning in Games](https://project.inria.fr/sharp/files/2025/07/Learning_under_uncertainty.pdf)

*We investigate how randomness affects learning in games by examining a perturbed variant of the continuous-time FTRL dynamics. Our findings reveal that "uncertainty favors extremes": every player's choices approach pure strategies in finite time. Moreover, we show that (a) the only possible limits of the perturbed dynamics are pure Nash equilibria; and (b) a span of pure strategies is stable and attracting if and only if it is closed under better replies. Finally, we prove that in games where the deterministic dynamics are recurrent, random shocks cause trajectories to drift toward the boundary on average.* *(A toy simulation in this spirit closes this post.)*

Authors: Pierre-Louis Cauvin, Davide Legacci, Panayotis Mertikopoulos (Univ. Grenoble Alpes, LIG)

Note: This work has been accepted at ICML 2025 (poster).

**15:20 Charlotte Laclau (Télécom Paris, Institut Polytechnique de Paris)** • Seeking for long-term fairness

15:50 Closing
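To close, here is a toy numerical illustration of "uncertainty favors extremes": a crude Euler–Maruyama discretization of noisy exponential weights (an instance of FTRL with entropic regularization) in Matching Pennies, a zero-sum game whose deterministic dynamics famously cycle around the interior equilibrium. All parameters below (step size, noise level, horizon) are arbitrary choices, and this is only a loose illustration of the phenomenon, not the construction analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Matching Pennies: A is player 1's payoff matrix; player 2 receives -A.
# Without noise, exponential weights orbits the mixed equilibrium (1/2, 1/2).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def softmax(y):
    z = np.exp(y - y.max())
    return z / z.sum()

dt, sigma, steps = 0.01, 0.5, 200_000
y1, y2 = np.zeros(2), np.zeros(2)          # cumulative (perturbed) payoff scores
for _ in range(steps):
    x1, x2 = softmax(y1), softmax(y2)
    shock = sigma * np.sqrt(dt) * rng.normal(size=(2, 2))
    y1 += A @ x2 * dt + shock[0]           # player 1's noisy payoff field
    y2 += -(A.T @ x1) * dt + shock[1]      # zero-sum: player 2 sees -A

print("player 1:", softmax(y1).round(3), "player 2:", softmax(y2).round(3))
```

With sigma = 0 the mixed strategies keep cycling through the interior; with the shocks switched on, one typically finds both players pinned near a pure strategy by the end of the run, consistent with the drift-toward-the-boundary result.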