Results – SEACS

Selected results:

A final report on the SEACS project can be found at: Final Report

Model-driven approaches for stochastic representations of geophysical systems/dynamics

Resseguier et al. Geophysical flows under location uncertainty, Part I Random transport and general models. GAFD 2017 (link). A stochastic flow representation is considered with the Eulerian velocity decomposed into a smooth large-scale component and a rough small-scale turbulent component. The latter is specified as a random field uncorrelated in time. The material derivative is modified accordingly, leading to a stochastic version that includes a drift correction, an inhomogeneous and anisotropic diffusion, and a multiplicative noise. As derived, this stochastic transport exhibits a remarkable energy conservation property for any realization. As demonstrated, this pivotal operator further provides elegant means to derive stochastic formulations of classical representations of geophysical flow dynamics.

Chapron et al. Large scale flows under location uncertainty: a consistent stochastic framework. QJRMS 2018. Using a classical example, the Lorenz-63 model, an original stochastic framework is applied to represent large-scale geophysical flow dynamics. Rigorously derived from a reformulated material derivative, the proposed framework encompasses several meaningful mechanisms to model geophysical flows. The slightly compressible setup, as treated in the Boussinesq approximation, brings up a stochastic transport equation for the density and other related thermodynamical variables. Coupled to the momentum equation through a forcing term, a resulting stochastic Lorenz-63 model is consistently derived. Based on such a reformulated model, the pertinence of this large-scale stochastic approach is demonstrated over classical eddy-viscosity based large-scale representations.
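
For readers unfamiliar with the Lorenz-63 model used as the test case above, the following minimal sketch integrates the classical deterministic system with a fourth-order Runge-Kutta scheme. It is not the stochastic model derived in the paper; the parameter values (sigma = 10, rho = 28, beta = 8/3), the time step and the function names are assumptions chosen for illustration.

```python
def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the classical deterministic Lorenz-63 system."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """Advance `state` by one fourth-order Runge-Kutta step of size `dt`."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, dt=0.01, n_steps=1000):
    """Return the trajectory (list of states), initial state included."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(lorenz63, state, dt)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    final_state = integrate((1.0, 1.0, 1.0))[-1]
    print(final_state)
```

A stochastic variant would add noise and correction terms to the right-hand side; their exact form is given in the paper and is not reproduced here.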

Data-driven approaches for stochastic representations of geophysical systems/dynamics

Lguensat et al. The Analog Data Assimilation. Monthly Weather Review, 2017 (link). In light of growing interest in data-driven methods for oceanic, atmospheric, and climate sciences, this work focuses on the field of data assimilation and presents the analog data assimilation (AnDA). The proposed framework produces a reconstruction of the system dynamics in a fully data-driven manner where no explicit knowledge of the dynamical model is required. Instead, a representative catalog of trajectories of the system is assumed to be available. Based on this catalog, the analog data assimilation combines the nonparametric sampling of the dynamics using analog forecasting methods with ensemble-based assimilation techniques. This study explores different analog forecasting strategies and derives both ensemble Kalman and particle filtering versions of the proposed analog data assimilation approach. Numerical experiments are examined for two chaotic dynamical systems: the Lorenz-63 and Lorenz-96 systems. The performance of the analog data assimilation is discussed with respect to classical model-driven assimilation. A Matlab toolbox and Python library of the AnDA are provided to help further research building upon the present findings.
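
The analog forecasting step described above can be sketched as follows. This is a minimal illustration, not the AnDA toolbox itself: it assumes a locally-constant forecasting strategy with inverse-distance weights, and the names `analog_forecast`, `analogs` and `successors` are hypothetical. The catalog is represented as pairs of states and their observed successors.

```python
import math


def analog_forecast(state, analogs, successors, k=3):
    """Locally-constant analog forecast.

    Returns the weighted mean of the successors of the `k` catalog states
    closest to `state`. `analogs[i]` and `successors[i]` are consecutive
    states of a catalog trajectory.
    """
    # Euclidean distance from the query state to every catalog state.
    distances = [math.dist(state, a) for a in analogs]
    nearest = sorted(range(len(analogs)), key=distances.__getitem__)[:k]
    # Inverse-distance weights (assumed here; the small constant avoids
    # a division by zero when the query coincides with a catalog state).
    weights = [1.0 / (distances[i] + 1e-12) for i in nearest]
    total = sum(weights)
    return tuple(
        sum(w * successors[i][d] for w, i in zip(weights, nearest)) / total
        for d in range(len(state))
    )


if __name__ == "__main__":
    # Toy catalog built from the scalar map x -> 2 * x.
    catalog_states = [(0.0,), (1.0,), (2.0,), (3.0,)]
    catalog_next = [(0.0,), (2.0,), (4.0,), (6.0,)]
    print(analog_forecast((1.5,), catalog_states, catalog_next, k=2))
```

In an ensemble-based assimilation scheme, such a forecast would be applied to each ensemble member in place of the explicit dynamical model.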

Rousseau et al. Residual Networks as Flows of Diffeomorphisms. JMIV 2019 (link). This paper addresses the understanding and characterization of residual networks (ResNet), which are among the state-of-the-art deep learning architectures for a variety of supervised learning problems. We focus on the mapping component of ResNets, which maps the embedding space toward a new unknown space where the prediction or classification can be stated according to linear criteria. We show that this mapping component can be regarded as the numerical implementation of continuous flows of diffeomorphisms governed by ordinary differential equations. In particular, ResNets with shared weights are fully characterized as numerical approximations of exponential diffeomorphic operators. We stress both theoretically and numerically the relevance of the enforcement of diffeomorphic properties and the importance of numerical issues to make consistent the continuous formulation and the discretized ResNet implementation. We further discuss the resulting theoretical and computational insights into ResNet architectures.
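
The link between residual units and ordinary differential equations can be illustrated with a small sketch. It assumes a stack of residual units sharing the same branch `f`, which coincides with forward-Euler integration of dx/dt = f(x). The linear branch below is a hypothetical choice, made only because its exact flow is known in closed form; it is not the architecture studied in the paper.

```python
import math


def residual_block(x, f, step):
    """One residual unit: identity shortcut plus a scaled residual branch."""
    return [xi + step * fi for xi, fi in zip(x, f(x))]


def resnet_forward(x, f, n_blocks, total_time=1.0):
    """Stack `n_blocks` residual units that share the same branch `f`.

    This is forward-Euler integration of dx/dt = f(x) over
    [0, total_time] with step total_time / n_blocks.
    """
    step = total_time / n_blocks
    for _ in range(n_blocks):
        x = residual_block(x, f, step)
    return x


def linear_branch(x, rate=-1.0):
    """Hypothetical branch f(x) = rate * x, whose exact flow is
    x(t) = exp(rate * t) * x(0)."""
    return [rate * xi for xi in x]


if __name__ == "__main__":
    exact = math.exp(-1.0)
    for depth in (10, 100, 1000):
        approx = resnet_forward([1.0], linear_branch, depth)[0]
        print(depth, abs(approx - exact))
```

The printed error shrinks as the number of units grows, which reflects the consistency between the discretized network and the continuous flow in this toy setting.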

Ouala et al. Learning Latent Dynamics for Partially-Observed Chaotic Systems. Preprint 2019 (link). This paper addresses the data-driven identification of latent dynamical representations of partially-observed systems, i.e. dynamical systems for which some components are never observed, with an emphasis on forecasting applications, including long-term asymptotic patterns. Whereas state-of-the-art data-driven approaches rely on delay embeddings and linear decompositions of the underlying operators, we introduce a framework based on the data-driven identification of an augmented state-space model using a neural-network-based representation. For a given training dataset, it amounts to jointly learning an ODE (Ordinary Differential Equation) representation in the latent space and reconstructing latent states. Through numerical experiments, we demonstrate the relevance of the proposed framework w.r.t. state-of-the-art approaches in terms of short-term forecasting performance and long-term behaviour. We further discuss how the proposed framework relates to Koopman operator theory and Takens’ embedding theorem.
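
As a point of comparison, the delay-embedding construction used by the baseline approaches mentioned above can be sketched in a few lines. This illustrates the classical technique only, not the augmented state-space model proposed in the paper; the function name and its parameters are assumptions.

```python
def delay_embedding(series, dim, lag=1):
    """Build delay vectors (x[t], x[t - lag], ..., x[t - (dim - 1) * lag])
    from a scalar time series observed at regular time steps."""
    start = (dim - 1) * lag
    return [
        tuple(series[t - j * lag] for j in range(dim))
        for t in range(start, len(series))
    ]


if __name__ == "__main__":
    observed = [1, 2, 3, 4, 5]
    print(delay_embedding(observed, dim=3))
    # -> [(3, 2, 1), (4, 3, 2), (5, 4, 3)]
```

Each delay vector stands in for the unobserved components of the state, whereas the approach described above learns the latent components jointly with the dynamics.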

Ouala et al. Neural Network Based Kalman Filters for the Spatio-Temporal Interpolation of Satellite-Derived Sea Surface Temperature. Remote Sensing, 2018 (link). The forecasting and reconstruction of oceanic dynamics is a crucial challenge. While model-driven strategies are still the state-of-the-art approaches in the reconstruction of spatio-temporal dynamics, the ever-increasing availability of data collections in oceanography has raised the relevance of data-driven approaches as computationally efficient representations for the reconstruction of spatio-temporal fields. These tools have proved to outperform classical state-of-the-art interpolation techniques such as optimal interpolation and DINEOF in the retrieval of fine-scale structures while still being computationally efficient compared to model-based data assimilation schemes. However, coupling these data-driven priors to classical filtering schemes limits their potential representativity. From this point of view, the recent advances in machine learning, and especially neural networks and deep learning, can provide a new infrastructure for dynamical modeling and interpolation within a data-driven framework. In this work, we address this challenge and develop a novel Neural-Network-based (NN-based) Kalman filter for spatio-temporal interpolation of sea surface dynamics. Based on a data-driven probabilistic representation of spatio-temporal fields, our approach can be regarded as an alternative to classical filtering schemes such as the ensemble Kalman filter (EnKF) in data assimilation. Overall, the key features of the proposed approach are two-fold: (i) we propose a novel architecture for the stochastic representation of two-dimensional (2D) geophysical dynamics based on neural networks, (ii) we derive the associated parametric Kalman-like filtering scheme for a computationally-efficient spatio-temporal interpolation of Sea Surface Temperature (SST) fields. We illustrate the relevance of our contribution for an OSSE (Observing System Simulation Experiment) in a case-study region off South Africa. Our numerical experiments report significant improvements in terms of reconstruction performance compared with operational and state-of-the-art schemes (e.g., optimal interpolation, Empirical Orthogonal Function (EOF) based interpolation and analog data assimilation).
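
To make the filtering vocabulary concrete, here is a minimal sketch of a classical scalar Kalman filter applied to a series with missing samples. It is not the NN-based scheme of the paper: the dynamics are assumed linear with a fixed coefficient `m`, and all parameter names and default values are hypothetical. Missing samples are handled by skipping the analysis step, so the filter interpolates with the forecast alone.

```python
def kalman_interpolate(observations, m=1.0, q=0.1, r=0.5, x0=0.0, p0=1.0):
    """Scalar Kalman filter over a series where `None` marks a missing sample.

    m: linear dynamics coefficient, q: model-error variance,
    r: observation-error variance, (x0, p0): initial mean and variance.
    Returns the list of (mean, variance) pairs after each time step.
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Forecast step: propagate mean and variance with the linear model.
        x, p = m * x, m * m * p + q
        if y is not None:
            # Analysis step: blend forecast and observation with the gain.
            gain = p / (p + r)
            x, p = x + gain * (y - x), (1.0 - gain) * p
        estimates.append((x, p))
    return estimates


if __name__ == "__main__":
    for mean, variance in kalman_interpolate([1.0, None, 1.0]):
        print(round(mean, 4), round(variance, 4))
```

The variance grows across the gap and shrinks again once an observation is assimilated. In the paper, the hand-specified linear model is replaced by a neural-network-based representation of the dynamics.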

Fablet et al. End-to-end learning of energy-based representations for irregularly-sampled signals and images. Preprint 2019 (link). For numerous domains, including for instance earth observation, medical imaging and astrophysics, available image and signal datasets often involve irregular space-time sampling patterns and large missing data rates. These sampling properties may be a critical obstacle to applying state-of-the-art learning-based approaches (e.g., auto-encoders, CNNs), fully benefiting from the available large-scale observations and reaching breakthroughs in the reconstruction and identification of processes of interest. In this paper, we address the end-to-end learning of representations of signals, images and image sequences from irregularly-sampled data, i.e. when the training data involve missing data. By analogy with a Bayesian formulation, we consider energy-based representations. Two energy forms are investigated: one derived from auto-encoders and one relating to Gibbs priors. The learning stage of these energy-based representations (or priors) involves a joint interpolation issue, which amounts to solving an energy minimization problem under observation constraints. Using a neural-network-based implementation of the considered energy forms, we can state an end-to-end learning scheme from irregularly-sampled data. We demonstrate the relevance of the proposed representations for different case-studies: namely, multivariate time series, 2D images and image sequences.
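
The idea of interpolation as energy minimization under observation constraints can be illustrated on a one-dimensional signal. The sketch below assumes a fixed, hand-written quadratic smoothness energy as a stand-in for the learned energies of the paper, and minimizes it by plain gradient descent; the function name, the step size and the number of iterations are assumptions.

```python
def interpolate_by_energy(signal, n_iter=2000, step=0.2):
    """Fill `None` entries by gradient descent on the smoothness energy
    E(x) = sum_i (x[i + 1] - x[i]) ** 2, keeping observed entries fixed.

    The signal must contain at least one observed entry.
    """
    observed = [v is not None for v in signal]
    known = [v for v in signal if v is not None]
    # Initialise missing entries with the mean of the observed ones.
    mean = sum(known) / len(known)
    x = [v if v is not None else mean for v in signal]
    n = len(x)
    for _ in range(n_iter):
        # Gradient of E with respect to every entry of x.
        grad = [0.0] * n
        for i in range(n - 1):
            diff = x[i + 1] - x[i]
            grad[i] -= 2.0 * diff
            grad[i + 1] += 2.0 * diff
        # Observation constraint: only the missing entries are updated.
        x = [xi if obs else xi - step * gi
             for xi, gi, obs in zip(x, grad, observed)]
    return x


if __name__ == "__main__":
    print(interpolate_by_energy([0.0, None, None, 3.0]))
```

With this particular energy the minimizer is the linear interpolation between observed samples. A learned energy, as investigated in the paper, would favour reconstructions that resemble the training data instead.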

Application to upper ocean systems/dynamics

Fablet et al. Data-driven Models for the Spatio-Temporal Interpolation of satellite-derived SST Fields. IEEE TCI 2017 (link). Satellite-derived products are of key importance for the high-resolution monitoring of the ocean surface at a global scale. Due to the sensitivity of spaceborne sensors to the atmospheric conditions as well as the associated spatio-temporal sampling, ocean remote sensing data may involve high missing data rates. The spatio-temporal interpolation of these data remains a key challenge to deliver L4 gridded products to end-users. Whereas operational products mostly rely on model-driven approaches, especially optimal interpolation based on Gaussian process priors, the availability of large-scale observation and simulation datasets advocates for the development of novel data-driven models. This study investigates such models. We extend the recently introduced analog data assimilation to high-dimensional spatio-temporal fields using a multi-scale patch-based decomposition. Using an Observing System Simulation Experiment (OSSE) for sea surface temperature, we demonstrate the relevance of the proposed data-driven scheme for the real missing data patterns of the high-resolution infrared METOP sensor. It yields a significant improvement w.r.t. state-of-the-art techniques in terms of interpolation error (about 50% of relative gain) and spectral characteristics for horizontal scales smaller than 100 km. We further discuss the key features and parameterizations of the proposed data-driven approach as well as its relevance with respect to classical interpolation techniques.

Lopez Radcenco et al. Analog Data Assimilation for Along-track Nadir and SWOT Altimetry Observations in the Western Mediterranean Sea. IEEE JSTARS, 2019 (link). Current generation satellite altimetry missions have played a fundamental role in improving our understanding of sea surface dynamics, despite only being able to provide measurements along the satellite track. In this respect, the future SWOT altimetry mission will be the first mission to produce complete two-dimensional wide-swath satellite observations. With a view towards the upcoming SWOT mission launch, we explore the potential of SWOT observations to improve the reconstruction of high-resolution sea level anomaly (SLA) fields from satellite-derived data. Given the ever-increasing availability of multi-source datasets that supports the exploration of data-driven alternatives to classical model-driven formulations, we focus here on recently introduced data-driven models for the interpolation of geophysical fields. Using an Observing System Simulation Experiment (OSSE), we demonstrate the relevance of SWOT observations to better constrain data-driven interpolation models in order to improve the reconstruction of mesoscale features. Reported results suggest that SWOT observations can provide more information than currently available nadir along-track altimetry observations and show an additional SLA reconstruction performance improvement when the joint assimilation of SWOT and nadir along-track observations is considered.

Lguensat et al. EddyNet. IEEE IGARSS 2018 (link). Convolutional Neural Networks (CNNs) have recently attracted considerable attention for remote sensing applications. In this paper, we explore CNN-based models and strategies for ocean remote sensing data, and more precisely the detection and classification of oceanic eddies from satellite-derived Sea Surface Height (SSH) maps. We develop several architectures inspired by classical works such as U-Net, ResNet and V-Net and assess their performance on a real dataset. From a methodological point of view, we also embrace ideas from the Level Set (LS) method, a traditional image segmentation method, and design an LS-CNN technique that combines CNNs and level sets. Experiments on the Southern Atlantic Ocean show the superiority of the proposed LS-CNNs in comparison to other CNN architectures. We make available the datasets and codes used for this work to foster more research in this direction.

You can find ai4oceandyn results here.