Invited Speakers

Henry Moss (University of Cambridge)
Anthony Nouy (Centrale Nantes)
Patricia Reynaud-Bouret (Université Côte d’Azur)
Alessandro Rudi (ENS Paris)
Maria Strazzullo (Politecnico di Torino)
Aretha Teckentrup (University of Edinburgh)
Anna Maria Massone (University of Genova) (Invited Lecture)
Chiara Tommasi (University of Milan)

Dario AZZIMONTI

A short overview of preference learning with Gaussian process-based approaches

In many real-world applications, users express their preferences via direct choices. For example, suppose that we aim to model a user’s favoured mode of transportation. We could conduct a survey where each participant specifies their favoured mode of travel. We can then learn the relationship between a user’s features and their preferences. This task can be achieved by using a Gaussian process (GP) model with an appropriate likelihood. In this talk we will review the fundamental properties of preference theory and its association with utility functions. We will show that, depending on which properties are satisfied, a different GP model is more appropriate.
Furthermore, we will extend beyond binary preferences to encompass choices among items within a finite set. We will demonstrate how a GP model, equipped with a customized likelihood, effectively learns from such structured data. Finally, we will highlight some applications of these methods to Bayesian optimization tasks. By integrating GP-based preference models, Bayesian optimization benefits from a nuanced understanding of user choices, facilitating efficient decision-making in complex, uncertain domains.
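
As a minimal illustration of the kind of model discussed above (our own hedged sketch, not the speaker's code), the following fits a GP-based latent utility to simulated pairwise choices with a probit preference likelihood and computes the MAP utility; all item features, kernel parameters and choice data below are hypothetical.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    # Toy preference-learning sketch: a GP prior on a latent utility f over
    # items, with a probit likelihood for pairwise preferences
    # "item i is preferred to item j". Everything below is hypothetical.
    rng = np.random.default_rng(0)
    X = np.linspace(0, 1, 30)[:, None]                     # item features
    K = np.exp(-0.5 * ((X - X.T) / 0.2) ** 2) + 1e-6 * np.eye(30)  # RBF kernel

    true_u = np.sin(2 * np.pi * X[:, 0])                   # hidden utility
    pairs = rng.integers(0, 30, size=(80, 2))
    pairs = pairs[pairs[:, 0] != pairs[:, 1]]
    # simulated choices: i beats j with probability Phi(u_i - u_j)
    win_prob = norm.cdf(true_u[pairs[:, 0]] - true_u[pairs[:, 1]])
    i_wins = rng.random(len(pairs)) < win_prob
    winners = np.where(i_wins, pairs[:, 0], pairs[:, 1])
    losers = np.where(i_wins, pairs[:, 1], pairs[:, 0])

    def neg_log_posterior(f):
        z = f[winners] - f[losers]                         # preference margins
        return -norm.logcdf(z).sum() + 0.5 * f @ np.linalg.solve(K, f)

    f_map = minimize(neg_log_posterior, np.zeros(30)).x    # MAP latent utility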


Henry MOSS

An Automatic Climate Scientist: Using Gaussian processes to uncover the secrets of the universe

There is a significant disconnect between modern ML methods and their ability to assist climate scientists in gaining deeper scientific understanding from observational data. Specifically, the challenge lies in distilling the fundamental equations governing physical phenomena, a goal known as equation discovery. In this talk we will explore how we can use Gaussian processes to uncover deep insights into intricate physical systems from measurement data, producing human-readable formulae that can be easily trusted, understood, and verified by domain experts.


Anthony NOUY

Optimal sampling for linear and nonlinear approximation

We consider the approximation of functions from point evaluations, using linear or nonlinear approximation tools. For linear approximation, recent results show that weighted least-squares projections yield quasi-optimal approximations in L2 with a near-optimal sampling budget. This can be achieved by drawing i.i.d. samples from suitable distributions (depending on the linear approximation tool) and using subsampling methods [1,2].
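
As a hedged illustration of the i.i.d. optimal sampling idea behind [1,2] (a toy sketch, not the authors' implementation), the following approximates a function in a Legendre polynomial space by weighted least squares, with samples drawn from the density proportional to the inverse Christoffel function; the target function and problem sizes are hypothetical.

    import numpy as np
    from numpy.polynomial import legendre

    # Optimal weighted least squares on [-1,1]: approximate u in the span of
    # the first m orthonormal Legendre polynomials, sampling x_i i.i.d. from
    # the density k_m(x)/m w.r.t. dmu = dx/2 and weighting by w = m / k_m.
    rng = np.random.default_rng(0)
    m = 8
    u = lambda x: np.exp(x) * np.sin(3 * x)          # hypothetical target

    def basis(x):
        # orthonormal Legendre basis w.r.t. dmu = dx/2 on [-1,1]
        return legendre.legvander(x, m - 1) * np.sqrt(2 * np.arange(m) + 1)

    def k_m(x):                                      # inverse Christoffel fn
        return (basis(x) ** 2).sum(axis=1)

    def sample(n):
        # rejection sampling from (k_m/m) dmu, using k_m <= m^2 on [-1,1]
        out = []
        while len(out) < n:
            x = rng.uniform(-1, 1, 4 * n)
            keep = rng.uniform(0, 1, 4 * n) < k_m(x) / m ** 2
            out.extend(x[keep])
        return np.array(out[:n])

    n = 3 * m                                        # near-linear sample budget
    x = sample(n)
    w = m / k_m(x)                                   # optimal weights
    B = basis(x) * np.sqrt(w)[:, None]
    c, *_ = np.linalg.lstsq(B, np.sqrt(w) * u(x), rcond=None)  # coefficients
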
In the first part of this talk, we review different strategies based on i.i.d. sampling and present alternative strategies based on repulsive point processes that perform the same task with a reduced sampling complexity [3].
In the second part, we show how these methods can be used to approximate functions with nonlinear approximation tools, in an active learning setting, by coupling iterative algorithms on manifolds with optimal sampling methods for the (quasi-)projection onto successive linear spaces [4].
The proposed algorithm can be interpreted as a stochastic gradient method using optimal sampling, with provable convergence properties under classical convexity and smoothness assumptions. It can also be interpreted as a natural gradient descent on a manifold embedded in L2, which appears to be a Newton-type algorithm when written in terms of the coordinates of a parametrized manifold.
We discuss applications of this algorithm to learning of neural networks and tree tensor networks.

These are joint works with R. Gruhlke, B. Michel, C. Miranda and P. Trunschke.

References:
[1] M. Dolbeault, D. Krieg, and M. Ullrich. A sharp upper bound for sampling numbers in L2, Applied and Computational Harmonic Analysis, 63 (2023), 113–134.
[2] C. Haberstich, A. Nouy, and G. Perrin. Boosted optimal weighted least-squares, Mathematics of Computation, 91(335) (2022), 1281–1315.
[3] A. Nouy and B. Michel. Weighted least-squares approximation with determinantal point processes and generalized volume sampling, 2023.
[4] R. Gruhlke, A. Nouy and P. Trunschke. Optimal sampling for stochastic and natural gradient descent. In preparation.


Anna Maria MASSONE

Image reconstruction methods and machine learning techniques: applications to astronomical imaging and space weather

The Sun is an enigmatic star that produces some of the most powerful explosive events in the heliosphere, such as solar flares and coronal mass ejections (CMEs). Studying these eruptions provides a unique opportunity to better understand both fundamental processes on the Sun and their space weather impacts at Earth. This tutorial aims to provide an introduction to mathematical models and computational tools that can be crucial to the comprehension of these open science issues.

More specifically, the first problem that will be addressed is the reconstruction of hard X-ray images of solar flares from data provided by the Spectrometer/Telescope for Imaging X-rays (STIX), part of the European Space Agency (ESA) Solar Orbiter mission. The STIX imaging concept exploits an indirect Fourier technique in which the native form of the measured data is a set of Fourier components of the emitted X-ray flux, called visibilities. The image reconstruction problem for STIX is therefore a linear Fourier inversion problem from limited data, which can be addressed by means of a variety of regularization methods.
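
As a hedged toy illustration of this inversion problem (not the STIX software), the following reconstructs a small image from a sparse set of simulated visibilities by Tikhonov-regularized least squares; the (u,v) frequencies, source shape and noise level are all hypothetical.

    import numpy as np

    # Toy Fourier inversion from limited data: recover an n-by-n image from a
    # sparse set of Fourier components ("visibilities") via Tikhonov
    # regularization. Geometry and data below are hypothetical.
    rng = np.random.default_rng(0)
    n = 24
    yy, xx = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    img = np.exp(-((xx - 9) ** 2 + (yy - 14) ** 2) / 8.0)   # toy flare source

    uv = rng.uniform(-0.3, 0.3, size=(60, 2))               # sparse (u,v) set
    # forward operator: A[k, p] = exp(-2*pi*i*(u_k*x_p + v_k*y_p))
    A = np.exp(-2j * np.pi * (np.outer(uv[:, 0], xx.ravel())
                              + np.outer(uv[:, 1], yy.ravel())))
    vis = A @ img.ravel()                                   # noiseless data
    vis += 0.05 * (rng.standard_normal(60) + 1j * rng.standard_normal(60))

    lam = 1.0                                               # Tikhonov weight
    # normal equations for real images: Re(A^H A + lam I) x = Re(A^H vis)
    AhA = (A.conj().T @ A).real + lam * np.eye(n * n)
    recon = np.linalg.solve(AhA, (A.conj().T @ vis).real).reshape(n, n)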

The problem addressed in the second part of this tutorial is concerned with the prediction of solar phenomena that, in the near-Earth environment, may dramatically affect ground- and space-based systems for communications, navigation, and power distribution. In this perspective, machine learning (ML) and deep learning (DL) offer great potential to learn the characteristics of the Sun-Earth system, to predict space weather impacts on timescales ranging from hours to days, and to improve the accuracy of the propagation models. Fully data-driven ML/DL approaches, as well as approaches combining the computational effectiveness of artificial intelligence with physical knowledge, will be illustrated.


Patricia REYNAUD-BOURET

Theoretical and practical implications of the Kalikow decomposition in the study of neuronal networks: simulation, statistics and learning

The Kalikow decomposition is a decomposition of stochastic processes (usually finite-state discrete-time processes but, more recently, also point processes) that consists in picking at random a finite neighborhood in the past and then making a transition in a Markovian manner. This kind of approach has been used for many years to prove the existence of certain processes, especially their stationary distributions. In particular, it allows one to prove the existence of processes that model infinite neuronal networks, such as Hawkes-like processes or Galves-Löcherbach processes. But beyond mere existence, this decomposition is a wonderful tool to simulate such networks, as open physical systems, in a way that from a computational point of view could be competitive with the most performant brain simulations. The Kalikow decomposition also leads to concentration inequalities, which allow us to understand the mathematical properties of the estimation of such networks via Lasso procedures. Finally, this decomposition is also a source of inspiration to understand how local rules at each neuron can make the whole network learn.
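
A minimal sketch of a Kalikow-type simulation (our toy reading of the idea, not the speaker's algorithm): at each time step every neuron first draws a random neighborhood of the past, here either the empty set or a single presynaptic neuron at a random lag, and then spikes with a probability depending only on that neighborhood; all sizes, rates and weights below are hypothetical.

    import numpy as np

    # Toy Kalikow-style network simulation: N binary neurons in discrete time.
    # At each step, each neuron independently picks a neighborhood: with
    # probability p0 the empty set (spontaneous rate only), otherwise one
    # presynaptic neuron at a random past lag, whose activity drives spiking.
    rng = np.random.default_rng(0)
    N, T, L = 50, 1000, 5              # neurons, time steps, max lag
    p0, base, drive = 0.3, 0.02, 0.6   # hypothetical rates
    pre = rng.integers(0, N, size=(N, L))   # candidate presynaptic partners

    spikes = np.zeros((T, N), dtype=int)
    spikes[:L] = rng.random((L, N)) < base  # arbitrary initial condition
    for t in range(L, T):
        empty = rng.random(N) < p0          # random neighborhood choice
        lag = rng.integers(1, L + 1, size=N)
        j = pre[np.arange(N), lag - 1]      # chosen presynaptic neuron
        past = spikes[t - lag, j]           # its activity at the chosen lag
        p_spike = np.where(empty, base, base + drive * past)
        spikes[t] = rng.random(N) < p_spike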


Alessandro RUDI

Kernel Methods in the quest for adaptivity in infinite-dimensional optimisation with dense conic constraints: applications in non-convex optimisation, optimal transport and beyond

Many problems in applied mathematics can be written as infinite-dimensional optimisation problems with dense conic constraints. While linear constraints can be effectively tackled using existing numerical schemes, like the collocation method, leading to adaptive algorithms, this adaptivity does not extend to problems involving dense conic constraints. Non-convex optimisation and optimal transport between densities are two examples where current methods fail to overcome the curse of dimensionality: despite lower bounds suggesting the potential for adaptive numerical algorithms, existing approaches suffer from worst-case complexities that prevent scalability to high-dimensional problems, even for regular instances.
We propose an extension of the collocation method that introduces adaptivity to infinite-dimensional optimisation with dense conic constraints. The approach is based on a novel class of non-negative functions that enjoys desirable properties from both analytical and computational perspectives. We derive numerical algorithms for non-convex optimisation and optimal transport between densities which exhibit computational complexities close to the lower bounds and effectively address the curse of dimensionality for regular instances. Our research opens the path to adaptive, scalable algorithms for high-dimensional instances across various classes of applied mathematics problems, addressing contemporary applications.
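
One well-known class of non-negative function models matching this description represents f(x) = k(x, Z)^T A k(x, Z) with a positive semidefinite matrix A; the sketch below is our assumption of the kind of construction meant here, with hypothetical anchor points and kernel, and simply verifies the built-in non-negativity.

    import numpy as np

    # Kernel-based non-negative model (hedged sketch): with anchor points
    # z_1..z_p and a PSD matrix A, f(x) = k(x,Z)^T A k(x,Z) >= 0 for all x.
    rng = np.random.default_rng(0)
    Z = rng.uniform(-1, 1, size=(10, 1))               # hypothetical anchors
    B = rng.standard_normal((10, 10))
    A = B @ B.T                                        # any PSD matrix works

    def k(x, Z, ell=0.3):                              # Gaussian kernel map
        d2 = ((x[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * ell ** 2))

    x = np.linspace(-1, 1, 200)[:, None]
    Phi = k(x, Z)                                      # (200, 10) features
    f = np.einsum("ni,ij,nj->n", Phi, A, Phi)          # non-negative by design
    assert f.min() >= -1e-12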


Maria STRAZZULLO

Reduced Order Methods for Parametric Optimal Control: an Overview and Diverse Applications

Optimal control governed by parametric partial differential equations (PDEs) constitutes an elegant mathematical framework with wide-ranging applications across scientific and industrial domains. Its primary objective is to guide the system’s evolution toward a beneficial state. Despite its significance in applied research, the practical utility of optimal control is hindered by the computational complexity of the underlying optimization problem.

The applicability of optimal control is limited due to prohibitively high computational costs associated with standard discretization techniques. This challenge becomes even more pronounced in the context of time-dependent and nonlinear parametric frameworks. To address these computational bottlenecks, the present work focuses on strategies employing model order reduction.

First, we introduce the framework and various algorithms that showcase how reduced order methods serve as an effective strategy for tackling optimal control problems. These methods are rigorously tested across a spectrum of PDEs, ranging from time-dependent to nonlinear equations in space-time formulations.
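
As a generic hedged sketch of the reduced order workflow (plain POD with Galerkin projection on a hypothetical parametric linear system, not the specific algorithms of the talk):

    import numpy as np

    # Generic POD sketch: collect high-fidelity snapshots over training
    # parameters, extract a reduced basis by SVD, and solve new parameter
    # instances by Galerkin projection. The toy system A(mu) u = b is made up.
    rng = np.random.default_rng(0)
    Nh, r = 200, 5                                     # full / reduced sizes
    A0 = np.diag(np.arange(1, Nh + 1, dtype=float))
    A1 = 0.01 * rng.standard_normal((Nh, Nh))
    b = np.ones(Nh)
    solve_full = lambda mu: np.linalg.solve(A0 + mu * A1, b)

    S = np.column_stack([solve_full(mu) for mu in np.linspace(0.1, 1.0, 20)])
    U, s, _ = np.linalg.svd(S, full_matrices=False)    # snapshot SVD
    Vr = U[:, :r]                                      # POD basis

    def solve_reduced(mu):                             # Galerkin projection
        Ar = Vr.T @ (A0 + mu * A1) @ Vr
        return Vr @ np.linalg.solve(Ar, Vr.T @ b)

    err = np.linalg.norm(solve_full(0.55) - solve_reduced(0.55))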

The second part of the presentation explores the broader implications of employing reduced control approaches in diverse scientific fields from an applied perspective. Specifically, we discuss their utility in addressing bifurcating phenomena, advancing geophysical models, and enhancing numerical stabilization techniques.

The talk aims to assess the potential of model order reduction in extending the practical reach of optimal control methodologies.


Aretha TECKENTRUP

Smoothed circulant embedding and applications in multilevel Monte Carlo methods

Parameters in mathematical models for physical processes are often impossible to determine fully or accurately, and are hence subject to uncertainty. By modelling the input parameters as stochastic processes, it is possible to quantify the uncertainty in the model outputs. In this talk, we employ the multilevel Monte Carlo (MLMC) method to compute expected values of quantities of interest related to partial differential equations with random coefficients. We make use of the circulant embedding method for sampling from the coefficient, and to further improve the computational complexity of the MLMC estimator, we devise and implement a smoothing technique integrated into the circulant embedding method. This allows us to choose the coarsest mesh on the first level of MLMC independently of the correlation length of the covariance function of the random field, leading to considerable savings in computational cost.
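
A hedged one-dimensional sketch of plain circulant embedding (without the smoothing refinement developed in the talk): the covariance vector is embedded in a symmetric circulant matrix whose eigenvalues are obtained by FFT, yielding exact Gaussian samples on a uniform grid; the exponential covariance and grid below are hypothetical.

    import numpy as np

    # Circulant embedding in 1D: sample a stationary Gaussian field on a
    # uniform grid of n points with covariance C(r) = exp(-r/ell).
    rng = np.random.default_rng(0)
    n, h, ell = 257, 1.0 / 256, 0.1
    c = np.exp(-np.arange(n) * h / ell)         # first covariance row
    c_tilde = np.concatenate([c, c[-2:0:-1]])   # symmetric circulant embedding
    N = len(c_tilde)                            # N = 2n - 2

    lam = np.fft.fft(c_tilde).real              # circulant eigenvalues
    assert lam.min() > -1e-10, "embedding not PSD: enlarge it (or smooth C)"
    lam = np.clip(lam, 0.0, None)

    # one complex FFT gives two independent real samples of the field
    eps = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    y = np.fft.fft(np.sqrt(lam) * eps) / np.sqrt(N)
    field1, field2 = y.real[:n], y.imag[:n]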


Chiara TOMMASI

Optimal design of experiments for model discrimination

One of the criticisms usually made of the theory of optimal design is that a particular model has to be assumed before designing the experiment, that is, before having any data. Very often the experimenter has several possible models for the data to be collected, and a model then has to be chosen after a discrimination hypothesis test.

An optimality criterion to discriminate between two homoscedastic models for Normally distributed observations is T-optimality (Atkinson and Fedorov, 1975a). When the rival models are nested and differ by s parameters, another commonly used criterion for model discrimination is the Ds-criterion (Atkinson, 1972). Both the T- and Ds-criteria can be applied only to specific types of models. In contrast, the KL-criterion, proposed by López-Fidalgo, Tommasi and Trandafir (2007), is based on the popular Kullback-Leibler divergence and may be applied in a very general context. It can be used to discriminate between models which may be nested or separate, homoscedastic or heteroscedastic, and with any distribution for the responses. Let us note that the KL-criterion coincides with the T-criterion when the observations are normally distributed, and it also includes the modifications of the T-criterion which handle heteroscedastic regression models (Uciński and Bogacka, 2005) and generalized linear models (Ponce de Leon and Atkinson, 1992). Some mathematical properties of the KL-criterion are studied in Aletti, May and Tommasi (2014) and, recently, Lanteri et al. (2023) have pointed out the connection between the KL-optimality criterion and the log-likelihood test.
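
As a hedged illustration of T-optimality, the sketch below runs a first-order exchange scheme in the spirit of Atkinson and Fedorov (1975a) to discriminate a hypothetical "true" quadratic response from a rival linear model on [-1, 1]; the grid, parameter values and step sizes are toy choices, not taken from the cited papers.

    import numpy as np

    # First-order exchange scheme for a T-optimal design: repeatedly fit the
    # rival model under the current design measure, then move mass toward the
    # point where the two models differ most. Models/parameters are made up.
    X = np.linspace(-1, 1, 201)                        # candidate design points
    eta1 = 1.0 + 0.5 * X + 2.0 * X ** 2                # assumed true response
    F2 = np.column_stack([np.ones_like(X), X])         # rival linear model

    w = np.full(len(X), 1.0 / len(X))                  # design weights
    for k in range(1, 2001):
        sw = np.sqrt(w)
        # best fit of the rival model under the current design measure
        theta, *_ = np.linalg.lstsq(F2 * sw[:, None], eta1 * sw, rcond=None)
        dev2 = (eta1 - F2 @ theta) ** 2                # squared lack of fit
        star = np.argmax(dev2)                         # most separating point
        alpha = 1.0 / (k + 1)                          # vanishing step size
        w *= 1 - alpha
        w[star] += alpha
    support = X[w > 1e-3]                              # approximate support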

Since the KL-criterion depends on the unknown parameters of the assumed “true” model, a Bayesian generalization of the criterion has also been proposed by Tommasi and López-Fidalgo (2010). In addition, by combining the KL-criterion with D-optimality, the double goal of model discrimination and parameter estimation can be attained; see Tommasi (2009) and May and Tommasi (2014).

Finally, to cope with the case of discrimination among several models, Tommasi, Martín-Martín and López-Fidalgo (2016) have suggested a max-min KL-efficiency criterion; see Atkinson and Fedorov (1975b) for a similar generalization of the T-criterion. To compute KL-optimal designs in practice, Chen, Chen, Hsu and Wong (2020) have recently proposed a hybrid algorithm based on particle swarm optimization.

References:
Atkinson, A. C. (1972) Planning experiments to detect inadequate regression models. Biometrika, 59, 275-293.

Atkinson, A. C. and Fedorov, V. V. (1975a) The design of experiments for discriminating between two rival models. Biometrika, 62, 57-70.

Atkinson, A. C. and Fedorov, V. V. (1975b) Optimal design: experiments for discriminating between several models. Biometrika, 62, 289-303.

Chen, R. B., Chen, P. Y., Hsu, C. L. and Wong, W. K. (2020) Hybrid algorithms for generating optimal designs for discriminating multiple nonlinear models under various error distributional assumptions. PLOS ONE, 1-30.

Lanteri, A., Leorato, S., López-Fidalgo, J. and Tommasi, C. (2023) Designing to detect heteroscedasticity in a regression model. JRSS (B), 85(2), 315-326.

López-Fidalgo, J., Tommasi, C. and Trandafir, P. C. (2007) An optimal experimental design criterion for discriminating between non-Normal models. JRSS (B), 69(2), 231-242.

May, C. and Tommasi, C. (2014) Model selection and parameter estimation in non-linear nested models: a sequential generalized DKL-optimum design. Statistica Sinica, 24.

Ponce de Leon, A. C. and Atkinson, A. C. (1992) The design of experiments to discriminate between two rival generalized linear models. Lecture Notes in Statistics: Advances in GLIM and Statistical Modelling, Springer-Verlag.

Tommasi, C. (2009) Optimal designs for both model discrimination and parameter estimation. JSPI, 139, 4123-4132.

Tommasi, C. and López-Fidalgo, J. (2010) Bayesian optimum designs for discriminating between models with any distribution. CSDA, 54, 143-150.

Tommasi, C., Martín-Martín, R. and López-Fidalgo, J. (2016) Max-min optimal discriminating designs for several statistical models. Stat Comput, 26, 1163-1172.

Uciński, D. and Bogacka, B. (2005) T-optimum designs for discrimination between two multiresponse dynamic models. JRSS (B), 67, 3-18.