Transformer models are increasingly used for solving Partial Differential Equations (PDEs). Several adaptations have been proposed, all of which suffer from the typical problems of Transformers, such as quadratic memory and time complexity. Furthermore, all prevalent architectures for PDE solving lack at least one of several desirable properties of an ideal surrogate model, such as (i) generalization to PDE parameters not seen during training, (ii) spatial and temporal zero-shot super-resolution, (iii) continuous temporal extrapolation, (iv) support for 1D, 2D, and 3D PDEs, and (v) efficient inference for longer temporal rollouts. To address these limitations, we propose Vectorized Conditional Neural Fields (VCNeFs), which represent the solution of time-dependent PDEs as neural fields. Contrary to prior methods, however, VCNeFs compute, for a set of multiple spatio-temporal query points, their solutions in parallel and model their dependencies through attention mechanisms. Moreover, VCNeF can condition the neural field on both the initial conditions and the parameters of the PDEs. An extensive set of experiments demonstrates that VCNeFs are competitive with and often outperform existing ML-based surrogate models.
\[ \partial_t u(t, \boldsymbol{x}) + u(t, \boldsymbol{x}) \partial_x u(t, \boldsymbol{x}) = \frac{\nu}{\pi} \partial_{xx} u(t, \boldsymbol{x}) \]
where \( \boldsymbol{x} \) represents the spatial coordinate and \( t \) the temporal coordinate. The solution of the PDE is the function \( u \), which is sought or learned by the neural network.

Neural Fields. In physics, a field is a quantity that is defined for all spatial and temporal coordinates. Neural Fields (NeFs) learn a function \( f \) which maps the spatial and temporal coordinates (i.e., \( \boldsymbol{x} \in \mathbb{R}^d \) and \( t \in \mathbb{R}^+ \), respectively) to a quantity \( q \in \mathbb{R}^c \). Mathematically, a neural field can be expressed as a function \[ f_\theta: (\mathbb{R}^+ \times \mathbb{R}^d) \rightarrow \mathbb{R}^c \text{ with } (t, \boldsymbol{x}) \mapsto q = u(t, \boldsymbol{x}) \] that is parametrized by a neural network with parameters \( \theta \).
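Concretely, a NeF is just a coordinate-to-quantity network. The following NumPy sketch implements \( f_\theta \) as a two-layer MLP over the concatenated coordinates; the weights are random (untrained) and the dimensions \( d = 2 \), \( c = 1 \) are illustrative assumptions, since in practice \( \theta \) is trained so that \( f_\theta \) fits \( u \):

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal neural field f_theta: (t, x) -> q as a two-layer MLP.
# Weights are random (untrained); dimensions are illustrative assumptions.
d, c, hidden = 2, 1, 32
W1 = rng.normal(size=(1 + d, hidden)) / np.sqrt(1 + d)
b1 = np.zeros(hidden)
W2 = rng.normal(size=(hidden, c)) / np.sqrt(hidden)
b2 = np.zeros(c)

def neural_field(t, x):
    inp = np.concatenate(([t], x))   # stack (t, x) into R^{1+d}
    h = np.tanh(inp @ W1 + b1)
    return h @ W2 + b2               # q in R^c

q = neural_field(0.5, np.array([0.1, -0.3]))
```

Because the field is a function of continuous coordinates, it can be queried at any \( (t, \boldsymbol{x}) \), which is what enables zero-shot super-resolution later on.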
Conditional Neural Fields. Conditional Neural Fields (CNeFs) extend NeFs with a conditioning factor \( \boldsymbol{z} \in \mathbb{R}^n \) that influences the output of the neural field. The conditioning factor was originally introduced in computer vision to control the colors or shapes of the objects being modeled. In contrast, we condition the neural field, which models the solution of the PDE, on the initial condition (IC) and the PDE parameters. Thus, the conditioning factor influences the entire field. This leads to the function \[ f_\theta: (\mathbb{R}^+ \times \mathbb{R}^d \times \mathbb{R}^n) \rightarrow \mathbb{R}^c \text{ with } (t, \boldsymbol{x}; \boldsymbol{z}) \mapsto q = u(t, \boldsymbol{x}; \boldsymbol{z}) \] that is parametrized by a neural network with parameters \( \theta \), where \( \boldsymbol{z} \) influences the modeled function \( u \).
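One simple way to let \( \boldsymbol{z} \) influence the field is to scale and shift the hidden features, FiLM-style. The sketch below makes that concrete; the FiLM-style mechanism and all dimensions are illustrative choices, not necessarily the concrete conditioning used in any particular model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Conditional neural field sketch: the conditioning factor z (e.g. an
# encoding of the IC and the PDE parameters) scales and shifts the hidden
# features (FiLM-style; mechanism and dimensions are assumptions).
d, c, n, hidden = 1, 1, 8, 32
W1 = rng.normal(size=(1 + d, hidden)) / np.sqrt(1 + d)
Wg = rng.normal(size=(n, hidden)) / np.sqrt(n)   # z -> per-feature scale
Wb = rng.normal(size=(n, hidden)) / np.sqrt(n)   # z -> per-feature shift
W2 = rng.normal(size=(hidden, c)) / np.sqrt(hidden)

def cond_field(t, x, z):
    h = np.tanh(np.concatenate(([t], x)) @ W1)   # embed (t, x)
    h = np.tanh(h * (z @ Wg) + z @ Wb)           # modulate features by z
    return h @ W2                                # q in R^c

q1 = cond_field(0.2, np.array([0.5]), np.ones(n))
q2 = cond_field(0.2, np.array([0.5]), -np.ones(n))
```

Querying the same \( (t, \boldsymbol{x}) \) with different \( \boldsymbol{z} \) yields different field values, i.e., one network represents a whole family of solutions.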
[Figure: illustration of a Neural Field and a Conditional Neural Field]
Vectorized Conditional Neural Fields. Typically, a (conditional) neural field generates the output quantities for all input spatial and temporal coordinates in multiple independent forward passes. Training and inference times can be improved by processing multiple inputs in parallel on the GPU, which is possible since all forward passes are independent. However, there are spatial dependencies between different input spatial coordinates, particularly when solving PDEs, that are not exploited by CNeFs or by processing multiple CNeF inputs in parallel. Consequently, we propose extending CNeFs to (i) take a vector of query points with arbitrary spatial coordinates of variable size as input, (ii) exploit the dependencies between the input coordinates when generating the outputs, and (iii) generate all outputs for the inputs in one forward pass.
Hence, we name our proposed model Vectorized Conditional Neural Field, since it implicitly generates a vectorization of the input spatial coordinates for a given time \( t \). The VCNeF model represents a function \[ f_\theta: (\mathbb{R}^+ \times \mathbb{R}^{s \times d}) \rightarrow \mathbb{R}^{s \times c} \text{ with } (t, \boldsymbol{X}) \mapsto u(t, \boldsymbol{X}) = \begin{pmatrix} u(t, \boldsymbol{x}_1) \\ \vdots \\ u(t, \boldsymbol{x}_s) \end{pmatrix} \] where \( u(t, \boldsymbol{x}_i) \) denotes the PDE solution at the spatial coordinate \( \boldsymbol{x}_i \). Note that we do not impose a structure on the spatial coordinates \( \boldsymbol{x}_i \) and that the number of spatial points \( s \) can be arbitrary. The model can process multiple timesteps \( t \) in parallel on the GPU to further improve training and inference time, since VCNeF does not exploit dependencies between the temporal coordinates.
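The vectorized evaluation can be sketched as follows: all \( s \) query points for one time \( t \) pass through the network together, and a self-attention layer couples them spatially. The single residual attention layer and all dimensions are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(2)

# Vectorized field sketch: all s query points for a time t are processed
# in one forward pass; self-attention couples them spatially.
# Dimensions and the single attention layer are assumptions.
s, d, c, h = 16, 1, 1, 32
Win = rng.normal(size=(1 + d, h)) / np.sqrt(1 + d)
Wq, Wk, Wv = (rng.normal(size=(h, h)) / np.sqrt(h) for _ in range(3))
Wout = rng.normal(size=(h, c)) / np.sqrt(h)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def vcnef_forward(t, X):               # X: (s, d) query coordinates
    inp = np.concatenate([np.full((len(X), 1), t), X], axis=1)
    H = np.tanh(inp @ Win)             # one token per query point, (s, h)
    A = softmax((H @ Wq) @ (H @ Wk).T / np.sqrt(h))
    H = H + A @ (H @ Wv)               # attention couples the s points
    return H @ Wout                    # (s, c): all solutions at once

U = vcnef_forward(0.3, rng.uniform(size=(s, d)))
```

Since nothing constrains the rows of `X` to a grid, the same forward pass works for arbitrary, unstructured query points and for any \( s \).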
VCNeF Properties. The design of the proposed VCNeF model yields the following properties: (i) generalization to PDE parameters not seen during training, (ii) spatial and temporal zero-shot super-resolution (i.e., increasing the spatial and temporal resolution after training), (iii) accelerated training and inference due to the vectorization, and (iv) support for physics-aware loss functions.
Neural Architecture. VCNeF uses a Linear Transformer to produce an attention-refined representation of the initial condition and the PDE parameters. In the modulation blocks, these attention-refined latent representations modulate a vectorized conditional neural field. The vectorized conditional neural field can be queried with temporal and spatial coordinates and uses a self-attention mechanism to leverage spatial dependencies between the generated solution points. To reduce the computational cost for 2D and 3D PDEs, the spatial domain is divided into non-overlapping patches, as in Vision Transformers (ViTs). However, unlike a traditional ViT, our patch generation has two branches: patches of a smaller size (\( p_S = 4 \) or \( 4 \times 4 \)) and of a larger size (\( p_S = 16 \) or \( 16 \times 16 \)), since we aim to capture the dynamics accurately at multiple scales.
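For a 1D field, the two-branch patching amounts to splitting the same signal into non-overlapping windows at two granularities. The patch sizes 4 and 16 come from the text above; the reshape-based helper and the signal length are illustrative simplifications:

```python
import numpy as np

def patchify(u, p):
    """Split a 1D signal of length L (L divisible by p) into
    non-overlapping patches of size p, giving L/p tokens."""
    return u.reshape(-1, p)

u = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))
small = patchify(u, 4)    # fine-scale tokens,   shape (16, 4)
large = patchify(u, 16)   # coarse-scale tokens, shape (4, 16)
```

Each branch then embeds its tokens independently, so the model sees both fine-grained local structure and coarser context of the same field.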
In our experiments, we focus on solving initial value problems for 1D Burgers', 1D Advection, and Compressible Navier-Stokes (CNS) equations from PDEBench. We choose neural operators, neural-field-based, and transformer-based models as baselines. Namely, we choose the Fourier Neural Operator (FNO), MP-PDE, CORAL, OFormer, and Galerkin Transformer as baselines. Except for CORAL, all models are trained in an autoregressive fashion.
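The tables below report nRMSE and bRMSE (lower is better). As a point of reference, here is a minimal sketch of a normalized RMSE in the PDEBench style, i.e., the RMSE of the prediction divided by the RMS norm of the target; the benchmark's exact normalization conventions may differ:

```python
import numpy as np

# Normalized RMSE sketch: RMSE of the prediction divided by the RMS norm
# of the target. Conventions vary; this is one common choice.
def nrmse(pred, target):
    return np.sqrt(np.mean((pred - target) ** 2)) / np.sqrt(np.mean(target ** 2))

y = np.sin(np.linspace(0.0, 2.0 * np.pi, 100))
perfect = nrmse(y, y)                  # 0 for a perfect prediction
trivial = nrmse(np.zeros_like(y), y)   # 1 for predicting all zeros
```

Under this definition, a trivial all-zero predictor scores exactly 1, which makes values well below 1 meaningful; bRMSE analogously restricts the error to the boundary points of the domain.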
Comparison to state-of-the-art Baselines for Fixed PDE Parameter Value and Resolutions
| PDE | Model | nRMSE (↓) | bRMSE (↓) |
|---|---|---|---|
| 1D Burgers' | FNO | 0.0987 | 0.0225 |
| | MP-PDE | 0.3046 (+208.7%) | 0.0725 (+221.7%) |
| | CORAL | 0.2221 (+125.1%) | 0.0515 (+128.2%) |
| | OFormer | 0.1035 (+4.9%) | 0.0215 (-4.5%) |
| | Galerkin Transformer | 0.1651 (+67.3%) | 0.0366 (+62.3%) |
| | VCNeF | 0.0824 (-16.5%) | 0.0228 (+1.3%) |
| 1D Advection | FNO | 0.0190 | 0.0239 |
| | MP-PDE | 0.0195 (+2.7%) | 0.0283 (+18.4%) |
| | CORAL | 0.0198 (+4.3%) | 0.0127 (-46.8%) |
| | OFormer | 0.0118 (-38.0%) | 0.0073 (-69.6%) |
| | Galerkin Transformer | 0.0621 (+227.1%) | 0.0349 (+46.2%) |
| | VCNeF | 0.0165 (-13.0%) | 0.0088 (-63.2%) |
| 1D Compressible Navier-Stokes | FNO | 0.5722 | 1.9797 |
| | CORAL | 0.5993 (+4.7%) | 1.5908 (-19.6%) |
| | OFormer | 0.4415 (-22.9%) | 2.0478 (+3.4%) |
| | Galerkin Transformer | 0.7019 (+22.7%) | 3.0143 (+52.3%) |
| | VCNeF | 0.2943 (-48.6%) | 1.3496 (-31.8%) |
| 2D Compressible Navier-Stokes | FNO | 0.5625 | 0.2332 |
| | Galerkin Transformer | 0.6702 (+19.2%) | 0.8219 (+252.4%) |
| | VCNeF | 0.1994 (-64.6%) | 0.0904 (-61.2%) |
| 3D Compressible Navier-Stokes | FNO | 0.8138 | 6.0407 |
| | VCNeF | 0.7086 (-12.9%) | 4.8922 (-19.0%) |
Spatial Zero-Shot Super-Resolution
Trained on lower spatial resolutions and tested on higher spatial resolutions
| PDE | Spatial res. | Model | nRMSE (↓) | bRMSE (↓) |
|---|---|---|---|---|
| 1D CNS | 256 | FNO | 0.5722 | 1.9797 |
| | | OFormer | 0.4415 | 2.0478 |
| | | VCNeF | 0.2943 | 1.3496 |
| | 512 | FNO | 0.6610 | 2.7683 |
| | | OFormer | 0.4657 | 2.5618 |
| | | VCNeF | 0.2943 | 1.3502 |
| | 1024 | FNO | 0.7320 | 3.5258 |
| | | OFormer | 0.4655 | 2.5526 |
| | | VCNeF | 0.2943 | 1.3510 |
| 3D Compressible Navier-Stokes | 32 x 32 x 32 | FNO | 0.8138 | 6.0407 |
| | | VCNeF | 0.7086 | 4.8922 |
| | 64 x 64 x 64 | FNO | 0.9452 | 8.7068 |
| | | VCNeF | 0.7228 | 5.1495 |
| | 128 x 128 x 128 | FNO | 1.0077 | 9.8633 |
| | | VCNeF | 0.7270 | 5.3208 |
Temporal Zero-Shot Super-Resolution
Trained on lower temporal resolutions and tested on higher temporal resolutions
| PDE | Temporal res. | Model | nRMSE (↓) | bRMSE (↓) |
|---|---|---|---|---|
| 1D CNS | 41 | FNO | 0.5722 | 1.9797 |
| | | CORAL | 0.5993 | 1.5908 |
| | | VCNeF | 0.2943 | 1.3496 |
| | 82 | FNO + Interp. | 0.5667 | 1.9639 |
| | | CORAL | 1.1524 | 3.7960 |
| | | VCNeF | 0.2965 | 1.3741 |
| 3D CNS | 11 | FNO | 0.8138 | 6.0407 |
| | | VCNeF | 0.7086 | 4.8922 |
| | 21 | FNO + Interp. | 0.8099 | 6.1938 |
| | | VCNeF | 0.7106 | 5.1446 |
Generalization to Unseen PDE Parameter Values
Trained on a set of PDE parameter values and tested on unseen values (boldfaced)
The experiments demonstrate that VCNeF is competitive with the baselines and often outperforms them. Furthermore, the model exhibits zero-shot super-resolution capabilities in space and time and generalizes to unseen PDE parameter values.
@misc{hagnberger2024vecnef,
title={Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations},
author={Jan Hagnberger and Marimuthu Kalimuthu and Daniel Musekamp and Mathias Niepert},
year={2024},
eprint={2406.03919},
archivePrefix={arXiv},
primaryClass={cs.LG}
}