lagrangebench 0.2.0

NeurIPS page with video and slides here.
Table of Contents

Installation
Usage
Datasets
Pretrained Models
Directory Structure
Contributing
Citation

Installation
Standalone library
Install the core lagrangebench library from PyPI with
python3.10 -m venv venv
source venv/bin/activate
pip install lagrangebench --extra-index-url=https://download.pytorch.org/whl/cpu

Note that by default lagrangebench is installed without JAX GPU support. To enable it, follow the instructions in the GPU support section.
Clone
Clone this GitHub repository
git clone https://github.com/tumaer/lagrangebench.git
cd lagrangebench

Install the dependencies with Poetry (>=1.6.0)
poetry install --only main

Alternatively, a requirements file is provided. It directly installs the CUDA version of JAX.
pip install -r requirements_cuda.txt

For a CPU version of the requirements file, use docs/requirements.txt.
GPU support
To run JAX on GPU, follow Installing JAX, or in general run
pip install -U "jax[cuda12]==0.4.29"

Note: as of 27.06.2024, to make our GNN models deterministic on GPUs, you need to set os.environ["XLA_FLAGS"] = "--xla_gpu_deterministic_ops=true". However, all current models rely on scatter_sum, and in deterministic mode this operation seems to be slower than running a plain Python for-loop; see #17844 and #10674.
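
A minimal sketch of setting this flag from Python, assuming it has to be in place before JAX initializes its GPU backend, which is why it is set before `import jax`. The helper name `add_xla_flag` is hypothetical and not part of lagrangebench; it appends to whatever `XLA_FLAGS` already contains instead of overwriting it.

```python
import os

DETERMINISTIC_FLAG = "--xla_gpu_deterministic_ops=true"


def add_xla_flag(flag, env=os.environ):
    """Append `flag` to XLA_FLAGS in `env` unless it is already present."""
    current = env.get("XLA_FLAGS", "").split()
    if flag not in current:
        current.append(flag)
    env["XLA_FLAGS"] = " ".join(current)
    return env["XLA_FLAGS"]


# Set the flag first, then import jax.
# The import is commented out so the sketch also runs without JAX installed.
add_xla_flag(DETERMINISTIC_FLAG)
# import jax
```

Calling the helper twice leaves a single copy of the flag, so it is safe to run from both a notebook cell and a script entry point.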

MacOS
Currently, only the CPU installation works. You will need to change a few small things to get it going:

Clone installation: in pyproject.toml change the torch version from 2.1.0+cpu to 2.1.0. Then, remove the poetry.lock file and run poetry install --only main.
Configs: You will need to set dtype=float32 and train.num_workers=0.

Although the current jax-metal==0.0.5 library supports JAX in general, a padding-related feature used by jax-md seems to be missing; see this issue.
Usage
Standalone benchmark library
A general tutorial is provided in the example notebook "Training GNS on the 2D Taylor Green Vortex" under ./notebooks/tutorial.ipynb on the LagrangeBench repository. The notebook covers the basics of LagrangeBench, such as loading a dataset, setting up a case, training a model from scratch and evaluating its performance.
Running in a local clone (main.py)
Alternatively, experiments can be set up with main.py, based on extensive YAML config files and CLI arguments (check configs/). Arguments are resolved in the following order of priority: 1) passed CLI arguments, 2) the YAML config, and 3) defaults.py (lagrangebench defaults).
When loading a saved model with load_ckp the config from the checkpoint is automatically loaded and training is restarted. For more details check the runner.py file.
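
The priority order described above can be illustrated with plain nested dictionaries. This is only a sketch of the layering idea, not the code used in main.py; the helpers `merge` and `parse_cli` are hypothetical, and the values are made up for the example.

```python
from copy import deepcopy


def merge(base, override):
    """Recursively merge `override` into a copy of `base`; `override` wins."""
    out = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)
        else:
            out[key] = deepcopy(value)
    return out


def parse_cli(args):
    """Turn ["a.b=1", "mode=infer"] into nested dicts; values stay strings."""
    cfg = {}
    for arg in args:
        dotted_key, value = arg.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = cfg
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return cfg


# Illustrative values only, not the real lagrangebench defaults.
defaults = {"mode": "all", "train": {"num_workers": 4}}
yaml_cfg = {"mode": "train", "train": {"num_workers": 2}}
cli_cfg = parse_cli(["train.num_workers=0"])

# Lowest priority first: defaults, then YAML, then CLI.
cfg = merge(merge(defaults, yaml_cfg), cli_cfg)
print(cfg)  # {'mode': 'train', 'train': {'num_workers': '0'}}
```

The CLI value stays a string in this sketch; a real config system would also convert it to the type of the default it replaces.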
Train
For example, to start a GNS run from scratch on the RPF 2D dataset use
python main.py config=configs/rpf_2d/gns.yaml

Some model presets can be found in ./configs/.
If mode=all is provided, then training (mode=train) and subsequent inference (mode=infer) on the test split will be run in one go.
Restart training
To restart training from the last checkpoint in load_ckp use
python main.py load_ckp=ckp/gns_rpf2d_yyyymmdd-hhmmss
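
When several runs have accumulated, the most recent one can be picked programmatically. The sketch below assumes run directories follow the `<model>_<dataset>_yyyymmdd-hhmmss` pattern used in the command above and that they live in a common folder such as ckp/; `run_timestamp` and `latest_run` are hypothetical helpers, not part of lagrangebench.

```python
from datetime import datetime
from pathlib import Path


def run_timestamp(run_dir):
    """Parse the trailing yyyymmdd-hhmmss stamp of a run directory name."""
    stamp = Path(run_dir).name.rsplit("_", 1)[-1]
    return datetime.strptime(stamp, "%Y%m%d-%H%M%S")


def latest_run(ckp_root, prefix):
    """Return the newest `<prefix>_<stamp>` directory under ckp_root, or None."""
    runs = [p for p in Path(ckp_root).glob(f"{prefix}_*") if p.is_dir()]
    return max(runs, key=run_timestamp, default=None)
```

For example, `latest_run("ckp", "gns_rpf2d")` returns a path that can be passed as `load_ckp=...`. A directory whose suffix is not a valid timestamp raises a `ValueError`, which is deliberate here so that unexpected folder names are noticed.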

Inference
To evaluate a trained model from load_ckp on the test split (eval.test=True) use
python main.py load_ckp=ckp/gns_rpf2d_yyyymmdd-hhmmss/best eval.rollout_dir=rollout/gns_rpf2d_yyyymmdd-hhmmss/best mode=infer eval.test=True

If the default eval.infer.out_type=pkl is active, then the generated trajectories and a metricsYYYY_MM_DD_HH_MM_SS.pkl file will be written to eval.rollout_dir. The metrics file contains all eval.infer.metrics properties for each generated rollout.
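
A minimal sketch for reading such a metrics file back, assuming it is a standard Python pickle and that the timestamp in the file name sorts chronologically. The layout of the loaded object is not specified here, so inspect its type and keys before relying on it; `load_latest_metrics` is a hypothetical helper.

```python
import pickle
from pathlib import Path


def load_latest_metrics(rollout_dir):
    """Load the metrics*.pkl file with the latest timestamp in its name."""
    files = sorted(Path(rollout_dir).glob("metrics*.pkl"))
    if not files:
        raise FileNotFoundError(f"no metrics*.pkl file in {rollout_dir}")
    # YYYY_MM_DD_HH_MM_SS sorts lexicographically, so the last entry is newest.
    with open(files[-1], "rb") as f:
        return pickle.load(f)
```

Only unpickle files you generated yourself or obtained from a trusted source, since loading a pickle can execute arbitrary code.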
Notebooks
We provide three notebooks that show LagrangeBench functionalities, namely:

tutorial.ipynb, with a general overview of the LagrangeBench library, including training and evaluation of a simple GNS model,
datasets.ipynb, with more details and visualizations of the datasets, and
gns_data.ipynb, showing how to train models within LagrangeBench on the datasets from the paper Learning to Simulate Complex Physics with Graph Networks.

Datasets
The datasets are hosted on Zenodo under the DOI: 10.5281/zenodo.10021925. If a dataset is not found in dataset.src, the data is automatically downloaded. Alternatively, to download the datasets manually, use the download_data.sh shell script with either a specific dataset name or "all":

Taylor Green Vortex 2D: bash download_data.sh tgv_2d datasets/
Reverse Poiseuille Flow 2D: bash download_data.sh rpf_2d datasets/
Lid Driven Cavity 2D: bash download_data.sh ldc_2d datasets/
Dam break 2D: bash download_data.sh dam_2d datasets/
Taylor Green Vortex 3D: bash download_data.sh tgv_3d datasets/
Reverse Poiseuille Flow 3D: bash download_data.sh rpf_3d datasets/
Lid Driven Cavity 3D: bash download_data.sh ldc_3d datasets/
All: bash download_data.sh all datasets/
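
The "download only if missing" behavior can be mimicked around the script with a few lines. This is a sketch under assumptions: it only checks whether a given path exists and returns the matching download_data.sh command, and it does not reproduce the library's own download logic. `download_command` and `ensure_dataset` are hypothetical helpers.

```python
from pathlib import Path

# Dataset names accepted by download_data.sh, as listed above.
DATASET_NAMES = ("tgv_2d", "rpf_2d", "ldc_2d", "dam_2d", "tgv_3d", "rpf_3d", "ldc_3d")


def download_command(name, target_dir="datasets/"):
    """Build the download_data.sh call for one dataset or for "all"."""
    if name != "all" and name not in DATASET_NAMES:
        raise ValueError(f"unknown dataset name: {name!r}")
    return ["bash", "download_data.sh", name, target_dir]


def ensure_dataset(src, name, target_dir="datasets/"):
    """Return None if `src` exists, else the command that would fetch it."""
    if Path(src).exists():
        return None
    return download_command(name, target_dir)
```

In a local clone, a returned command can be executed with `subprocess.run(cmd, check=True)` from the repository root.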

Pretrained Models
We provide pretrained model weights of our default GNS and SEGNN models on each of the 7 LagrangeBench datasets. You can download and run the checkpoints given below. In the table, we also provide the 20-step error measures on the full test split.

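| Dataset | Model       | MSE_20 | Sinkhorn | MSE_Ekin |
|---------|-------------|--------|----------|----------|
| 2D TGV  | GNS-10-128  | 5.9e-6 | 3.2e-7   | 4.9e-7   |
| 2D TGV  | SEGNN-10-64 | 4.4e-6 | 2.1e-7   | 5.0e-7   |
| 2D RPF  | GNS-10-128  | 4.0e-6 | 2.5e-7   | 2.7e-5   |
| 2D RPF  | SEGNN-10-64 | 3.4e-6 | 2.5e-7   | 1.4e-5   |
| 2D LDC  | GNS-10-128  | 1.5e-5 | 1.1e-6   | 6.1e-7   |
| 2D LDC  | SEGNN-10-64 | 2.1e-5 | 3.7e-6   | 1.6e-5   |
| 2D DAM  | GNS-10-128  | 3.1e-5 | 1.4e-5   | 1.1e-4   |
| 2D DAM  | SEGNN-10-64 | 4.1e-5 | 2.3e-5   | 5.2e-4   |
| 3D TGV  | GNS-10-128  | 5.8e-3 | 4.7e-6   | 4.8e-2   |
| 3D TGV  | SEGNN-10-64 | 5.0e-3 | 4.9e-6   | 3.9e-2   |
| 3D RPF  | GNS-10-128  | 2.1e-5 | 3.3e-7   | 1.8e-6   |
| 3D RPF  | SEGNN-10-64 | 1.7e-5 | 2.7e-7   | 1.7e-6   |
| 3D LDC  | GNS-10-128  | 4.1e-5 | 3.2e-7   | 1.9e-8   |
| 3D LDC  | SEGNN-10-64 | 4.1e-5 | 2.9e-7   | 2.5e-8   |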
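
As a reading aid, the 20-step MSE column above can be compared per dataset with a short script. The values are copied from the table; the comparison looks at this one metric only and says nothing about statistical significance. `lower_mse20` is a hypothetical helper.

```python
# (GNS-10-128, SEGNN-10-64) 20-step MSE values copied from the table above.
MSE20 = {
    "2D TGV": (5.9e-6, 4.4e-6),
    "2D RPF": (4.0e-6, 3.4e-6),
    "2D LDC": (1.5e-5, 2.1e-5),
    "2D DAM": (3.1e-5, 4.1e-5),
    "3D TGV": (5.8e-3, 5.0e-3),
    "3D RPF": (2.1e-5, 1.7e-5),
    "3D LDC": (4.1e-5, 4.1e-5),
}


def lower_mse20(gns, segnn):
    """Name the model with the lower 20-step MSE, or "tie" if equal."""
    if gns == segnn:
        return "tie"
    return "GNS" if gns < segnn else "SEGNN"


for dataset, (gns, segnn) in MSE20.items():
    print(f"{dataset}: {lower_mse20(gns, segnn)} (GNS/SEGNN = {gns / segnn:.2f})")
```

On this metric alone, SEGNN is lower on 2D TGV, 2D RPF, 3D TGV and 3D RPF, GNS is lower on 2D LDC and 2D DAM, and the two models tie on 3D LDC at the reported precision.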

To reproduce the numbers in the table, e.g., on 2D TGV with GNS, follow these steps:
# download the checkpoint (1) through the browser or
# (2) using the file ID from the URL, i.e., for 2D TGV + GNS
gdown 19TO4PaFGcryXOFFKs93IniuPZKEcaJ37
# unzip the downloaded file `gns_tgv2d.zip`
python -c "import shutil; shutil.unpack_archive('gns_tgv2d.zip', 'gns_tgv2d')"
# evaluate the model on the test split
python main.py gpu=$GPU_ID mode=infer eval.test=True load_ckp=gns_tgv2d/best

Directory Structure
📦lagrangebench
┣ 📂case_setup # Case setup manager
┃ ┣ 📜case.py # CaseSetupFn class
┃ ┗ 📜features.py # Feature extraction
┣ 📂data # Datasets and dataloading utils
┃ ┣ 📜data.py # H5Dataset class and specific datasets
┃ ┗ 📜utils.py
┣ 📂evaluate # Evaluation and rollout generation tools
┃ ┣ 📜metrics.py
┃ ┣ 📜rollout.py
┃ ┗ 📜utils.py
┣ 📂models # Baseline models
┃ ┣ 📜base.py # BaseModel class
┃ ┣ 📜egnn.py
┃ ┣ 📜gns.py
┃ ┣ 📜linear.py
┃ ┣ 📜painn.py
┃ ┣ 📜segnn.py
┃ ┗ 📜utils.py
┣ 📂train # Trainer method and training tricks
┃ ┣ 📜strats.py # Training tricks
┃ ┗ 📜trainer.py # Trainer method
┣ 📜defaults.py # Default values
┣ 📜runner.py # Runner wrapping training and inference
┗ 📜utils.py

Contributing
Welcome! We highly appreciate GitHub issues and PRs.
You can also chat with us on Discord.
Contributing Guideline
If you want to contribute to this repository, you will need the dev dependencies, i.e. install the environment with poetry install, without the --only main flag. We also recommend installing the pre-commit hooks so that you don't have to run pre-commit run manually before each commit. To sum up:
git clone https://github.com/tumaer/lagrangebench.git
cd lagrangebench
poetry install
source $PATH_TO_LAGRANGEBENCH_VENV/bin/activate

# install pre-commit hooks defined in .pre-commit-config.yaml
# ruff is configured in pyproject.toml
pre-commit install

# if you want to bump the version in both pyproject.toml and __init__.py, do
poetry self add poetry-bumpversion
poetry version patch # or minor/major

After you have run git add <FILE> and try to git commit, the pre-commit hook will
fix the linting and formatting of <FILE> before you are allowed to commit.
You should also run the tests locally before creating a PR. To do so, run:
# pytest is configured in pyproject.toml
pytest

Clone vs Library
LagrangeBench can be installed by cloning the repository or as a standalone library. This offers more flexibility, but it also has a drawback: some things have to be implemented twice. If you change any of the following, make sure to update its counterpart as well:

General setup in lagrangebench/runner.py and notebooks/tutorial.ipynb
Configs in configs/ and lagrangebench/defaults.py
Zenodo URLs in download_data.sh and lagrangebench/data/data.py
Dependencies in pyproject.toml, requirements_cuda.txt, and docs/requirements.txt
Library version in pyproject.toml and lagrangebench/__init__.py
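
The last item in the list above lends itself to an automated check. The sketch below compares the two version strings, assuming pyproject.toml contains a line of the form `version = "..."` and lagrangebench/__init__.py defines `__version__ = "..."`; both patterns are assumptions about the file contents, and the helper names are hypothetical.

```python
import re


def pyproject_version(text):
    """Return the first line-leading `version = "..."` value, or None."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    return match.group(1) if match else None


def init_version(text):
    """Return the `__version__ = "..."` value, or None."""
    match = re.search(
        r'^__version__\s*=\s*["\']([^"\']+)["\']', text, flags=re.MULTILINE
    )
    return match.group(1) if match else None


def versions_in_sync(pyproject_text, init_text):
    """True only if both versions were found and are identical."""
    a = pyproject_version(pyproject_text)
    b = init_version(init_text)
    return a is not None and a == b
```

In a clone, the two inputs could be read with `Path("pyproject.toml").read_text()` and `Path("lagrangebench/__init__.py").read_text()`, and the check run next to pytest before opening a PR.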

Citation
The paper (at NeurIPS 2023 Datasets and Benchmarks) can be cited as:
@article{toshev2024lagrangebench,
title={Lagrangebench: A lagrangian fluid mechanics benchmarking suite},
author={Toshev, Artur and Galletti, Gianluca and Fritz, Fabian and Adami, Stefan and Adams, Nikolaus},
journal={Advances in Neural Information Processing Systems},
volume={36},
year={2024}
}

The associated datasets can be cited as:
@dataset{toshev_2024_10491868,
author = {Toshev, Artur P. and Adams, Nikolaus A.},
title = {LagrangeBench Datasets},
month = jan,
year = 2024,
publisher = {Zenodo},
doi = {10.5281/zenodo.10491868},
url = {https://doi.org/10.5281/zenodo.10491868}
}

Publications
The following further publications are based on the LagrangeBench codebase:

Learning Lagrangian Fluid Mechanics with E(3)-Equivariant Graph Neural Networks (GSI 2023), A. P. Toshev, G. Galletti, J. Brandstetter, S. Adami, N. A. Adams
Neural SPH: Improved Neural Modeling of Lagrangian Fluid Dynamics (ICML 2024), A. P. Toshev, J. A. Erbesdobler, N. A. Adams, J. Brandstetter