Add: 7net-mf-ompa, 7net-omat #184

Merged
merged 14 commits into from
Mar 17, 2025
11 changes: 5 additions & 6 deletions CHANGELOG.md
@@ -3,20 +3,19 @@ All notable changes to this project will be documented in this file.


## [0.11.0]

Multi-fidelity learning implemented & New pretrained-models

### Added
- Build multi-fidelity model, SevenNet-MF, based on given modality in the yaml
- Modality support for sevenn_inference, sevenn_get_modal, and SevenNetCalculator
- [cli] sevenn_cp tool for checkpoint summary, input generation, multi-modal routines
- sevenn_cp tool for checkpoint summary, input generation, multi-modal routines
- Modality append / assign using sevenn_cp
- Loss weighting for energy, force and stress for corresponding data label
- Ignore unlabelled data when calculating loss. (e.g. stress data for non-pbc structure)
- Dict style dataset input for multi-modal and data-weight
- (experimental) cuEquivariance support
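
To illustrate the two loss-related entries above (per-label weighting and skipping unlabelled data), here is a minimal sketch. It is not SevenNet's implementation: it assumes missing labels are encoded as NaN, and the names `masked_mse` and `total_loss` as well as the weight values are hypothetical.

```python
import math
from typing import Dict, List


def masked_mse(pred: List[float], ref: List[float]) -> float:
    """Mean squared error over entries whose reference label is not NaN."""
    pairs = [(p, r) for p, r in zip(pred, ref) if not math.isnan(r)]
    if not pairs:
        return 0.0  # nothing labelled: this term contributes no loss
    return sum((p - r) ** 2 for p, r in pairs) / len(pairs)


def total_loss(
    pred: Dict[str, List[float]],
    ref: Dict[str, List[float]],
    weights: Dict[str, float],
) -> float:
    """Weighted sum of the per-label masked errors."""
    return sum(weights[k] * masked_mse(pred[k], ref[k]) for k in weights)


# Toy data: a non-periodic structure has no stress label, marked here as NaN.
pred = {'energy': [1.0, 2.0], 'force': [0.5, 0.5], 'stress': [0.1, 0.2]}
ref = {'energy': [1.0, 4.0], 'force': [0.5, 1.5], 'stress': [math.nan, math.nan]}
weights = {'energy': 1.0, 'force': 0.1, 'stress': 0.01}  # hypothetical weights

loss = total_loss(pred, ref, weights)
```

In this toy case the stress term drops out entirely, so only the energy and force errors contribute to `loss`.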

### Added (code)
- sevenn.train.modal_dataset SevenNetMultiModalDataset
- sevenn.scripts.backward_compatibility.py
- sevenn.checkpoint.py
- Downloading large checkpoints from url (7net-MF-ompa, 7net-omat)
- D3 wB97M param

### Changed
68 changes: 52 additions & 16 deletions README.md
@@ -5,10 +5,8 @@

SevenNet (Scalable EquiVariance Enabled Neural Network) is a graph neural network (GNN) interatomic potential package that supports parallel molecular dynamics simulations with [`LAMMPS`](https://lammps.org). Its underlying GNN model is based on [`NequIP`](https://github.com/mir-group/nequip).

> [!CAUTION]
> SevenNet+LAMMPS parallel after the commit id of `14851ef (v0.9.3 ~ 0.9.5)` has a serious bug.
> It gives wrong forces when the number of mpi processes is greater than two. The corresponding pip version is yanked for this reason. The bug is fixed for the main branch since `v0.10.0`, and pip (`v0.9.3.post0`).

> [!NOTE]
> We will soon release a CUDA-accelerated version of SevenNet, which will significantly increase the speed of our pre-trained models on [Matbench Discovery](https://matbench-discovery.materialsproject.org/).

## Features
- Pre-trained GNN interatomic potential and fine-tuning interface.
@@ -19,29 +17,66 @@ SevenNet (Scalable EquiVariance Enabled Neural Network) is a graph neural networ

## Pre-trained models
So far, we have released three pre-trained SevenNet models. Each model has various hyperparameters and training sets, resulting in different accuracy and speed. Please read the descriptions below carefully and choose the model that best suits your purpose.
We provide the training set MAEs (energy, force, and stress) F1 score for WBM dataset and $\kappa_{\mathrm{SRME}}$ from phonondb. For details on these metrics and performance comparisons with other pre-trained models, please visit [Matbench Discovery](https://matbench-discovery.materialsproject.org/).
We provide the training set MAEs (energy, force, and stress), F1 score, and RMSD for the WBM dataset, as well as $\kappa_{\mathrm{SRME}}$ from phonondb and CPS (Combined Performance Score). For details on these metrics and performance comparisons with other pre-trained models, please visit [Matbench Discovery](https://matbench-discovery.materialsproject.org/).

These models can be used as interatomic potentials in LAMMPS and can also be loaded through the ASE calculator using each model's `keywords`. Please refer to [ASE calculator](#ase_calculator) for how to load a model through the ASE calculator.
Additionally, `keywords` can be used in other parts of SevenNet, such as `sevenn_inference`, `sevenn_get_model`, and `checkpoint` in `input.yaml` for fine-tuning.

**Acknowledgments**: The models trained on [`MPtrj`](https://figshare.com/articles/dataset/Materials_Project_Trjectory_MPtrj_Dataset/23713842) were supported by the Neural Processing Research Center program of Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd. The computations for training models were carried out using the Samsung SSC-21 cluster.

---

### **SevenNet-MF-ompa (17Mar2025)**
> Model keywords: `7net-mf-ompa` | `SevenNet-mf-ompa`

**This is our recommended pre-trained model**

This model leverages [multi-fidelity learning](https://pubs.acs.org/doi/10.1021/jacs.4c14455) to simultaneously train on the [MPtrj](https://figshare.com/articles/dataset/Materials_Project_Trjectory_MPtrj_Dataset/23713842), [sAlex](https://huggingface.co/datasets/fairchem/OMAT24), and [OMat24](https://huggingface.co/datasets/fairchem/OMAT24) datasets. As of March 17, 2025, it achieves state-of-the-art performance on [Matbench Discovery](https://matbench-discovery.materialsproject.org/) in the CPS (Combined Performance Score). We have found that this model outperforms SevenNet-l3i5 on most tasks, except for isolated molecule energy, where it performs slightly worse.

```python
from sevenn.calculator import SevenNetCalculator
# "mpa" refers to the MPtrj + sAlex modal, used for evaluating Matbench Discovery.
calc = SevenNetCalculator('7net-mf-ompa', modal='mpa') # Use modal='omat24' for OMat24-trained modal weights.
```
Theoretically, the `mpa` modal should produce PBE52 results, while the `omat24` modal yields PBE54 results.

When using the command-line interface of SevenNet, include the `--modal mpa` or `--modal omat24` option to select the desired modality.


#### **Matbench Discovery**
| CPS | F1 | $\kappa_{\mathrm{SRME}}$ | RMSD |
|:---:|:---:|:---:|:---:|
|**0.883**|**0.901**|0.317| **0.0115** |

[Detailed instructions for multi-fidelity](https://github.com/MDIL-SNU/SevenNet/blob/main/sevenn/pretrained_potentials/SevenNet_MF_0/README.md)

[Link to the full-information checkpoint](https://figshare.com/articles/software/7net_MF_ompa/28590722?file=53029859)

---
### **SevenNet-omat (17Mar2025)**
> Model keywords: `7net-omat` | `SevenNet-omat`

This model was trained solely on the [OMat24](https://huggingface.co/datasets/fairchem/OMAT24) dataset. It achieves state-of-the-art (SOTA) performance in $\kappa_{\mathrm{SRME}}$ on [Matbench Discovery](https://matbench-discovery.materialsproject.org/); however, the F1 score was not available due to a difference in the POTCAR version. Similar to `SevenNet-MF-ompa`, this model outperforms `SevenNet-l3i5` in most tasks, except for isolated molecule energy.

[Link to the full-information checkpoint](https://figshare.com/articles/software/SevenNet_omat/28593938).

#### **Matbench Discovery**
* $\kappa_{\mathrm{SRME}}$: **0.221**
---
### **SevenNet-l3i5 (12Dec2024)**
> Keywords in ASE: `7net-l3i5` and `SevenNet-l3i5`
> Model keywords: `7net-l3i5` | `SevenNet-l3i5`

The model increases the maximum spherical harmonic degree ($l_{\mathrm{max}}$) to 3, compared to `SevenNet-0` with $l_{\mathrm{max}}$ of 2. While **l3i5** offers improved accuracy across various systems compared to `SevenNet-0`, it is approximately four times slower. As of March 17, 2025, this model achieves state-of-the-art (SOTA) performance among compliant models on the CPS metric, which was newly introduced in [Matbench Discovery](https://matbench-discovery.materialsproject.org/).

The model increases the maximum spherical harmonic degree ($l_{\mathrm{max}}$) to 3, compared to **SevenNet-0 (11Jul2024)** with $l_{\mathrm{max}}$ of 2.
While **l3i5** offers improved accuracy across various systems compared to **SevenNet-0 (11Jul2024)**, it is approximately four times slower.
#### **Matbench Discovery**
| CPS | F1 | $\kappa_{\mathrm{SRME}}$ | RMSD |
|:---:|:---:|:---:|:---:|
|0.764 |0.76|0.55|0.0182|

* Training set MAE: 8.3 meV/atom (energy), 0.029 eV/Ang. (force), and 2.33 kbar (stress)
* Matbench F1 score: 0.76, $\kappa_{\mathrm{SRME}}$: 0.560
* Training time: 381 GPU-days on A100
---

### **SevenNet-0 (11Jul2024)**
> Keywords in ASE: `7net-0`, `SevenNet-0`, `7net-0_11Jul2024`, and `SevenNet-0_11Jul2024`
> Model keywords: `7net-0` | `SevenNet-0` | `7net-0_11Jul2024` | `SevenNet-0_11Jul2024`

The model architecture is mainly in line with [GNoME](https://github.com/google-deepmind/materials_discovery), a pretrained model that utilizes the NequIP architecture.
It has five interaction blocks with node features that consist of 128 scalars (*l*=0), 64 vectors (*l*=1), and 32 tensors (*l*=2).
@@ -50,9 +85,11 @@ The model was trained with [MPtrj](https://figshare.com/articles/dataset/Materia
This model is loaded as the default pre-trained model in ASE calculator.
For more information, click [here](sevenn/pretrained_potentials/SevenNet_0__11Jul2024).

* Training set MAE: 11.5 meV/atom (energy), 0.041 eV/Ang. (force), and 2.78 kbar (stress)
* Matbench F1 score: 0.67, $\kappa_{\mathrm{SRME}}$: 0.767
* Training time: 90 GPU-days on A100
#### **Matbench Discovery**
| F1 | $\kappa_{\mathrm{SRME}}$ |
|:---:|:---:|
|0.67|0.767|

---

In addition to these latest models, you can find our legacy models from [pretrained_potentials](./sevenn/pretrained_potentials).
@@ -106,7 +143,6 @@ The model can be loaded through the following Python code.
from sevenn.calculator import SevenNetCalculator
calc = SevenNetCalculator(model='7net-0', device='cpu')
```

SevenNet supports a CUDA-accelerated D3Calculator.
```python
from sevenn.calculator import SevenNetD3Calculator
1 change: 1 addition & 0 deletions pyproject.toml
@@ -28,6 +28,7 @@ dependencies = [
"numpy",
"matscipy",
"pandas",
"requests",
]
[project.optional-dependencies]
test = ["matscipy", "pytest-cov>=5"]
2 changes: 1 addition & 1 deletion setup.cfg
@@ -10,5 +10,5 @@ include_trailing_comma=True
force_grid_wrap=0
use_parentheses=True
line_length=80
known_third_party=ase,braceexpand,e3nn,numpy,packaging,pandas,pytest,sklearn,torch,torch_geometric,tqdm,yaml
known_third_party=ase,braceexpand,e3nn,numpy,packaging,pandas,pytest,requests,sklearn,torch,torch_geometric,tqdm,yaml
known_first_party=
30 changes: 15 additions & 15 deletions sevenn/_const.py
@@ -48,20 +48,18 @@
ACTIVATION_DICT = {'e': ACTIVATION_FOR_EVEN, 'o': ACTIVATION_FOR_ODD}

_prefix = os.path.abspath(f'{os.path.dirname(__file__)}/pretrained_potentials')
SEVENNET_0_11Jul2024 = (
f'{_prefix}/SevenNet_0__11Jul2024/checkpoint_sevennet_0.pth'
)
SEVENNET_0_22May2024 = (
f'{_prefix}/SevenNet_0__22May2024/checkpoint_sevennet_0.pth'
)
SEVENNET_l3i5 = (
f'{_prefix}/SevenNet_l3i5/checkpoint_l3i5.pth'
)
SEVENNET_MF_0 = (
f'{_prefix}/SevenNet_MF_0/checkpoint_sevennet_mf_0.pth'
)


SEVENNET_0_11Jul2024 = f'{_prefix}/SevenNet_0__11Jul2024/checkpoint_sevennet_0.pth'
SEVENNET_0_22May2024 = f'{_prefix}/SevenNet_0__22May2024/checkpoint_sevennet_0.pth'
SEVENNET_l3i5 = f'{_prefix}/SevenNet_l3i5/checkpoint_l3i5.pth'
SEVENNET_MF_0 = f'{_prefix}/SevenNet_MF_0/checkpoint_sevennet_mf_0.pth'
SEVENNET_MF_ompa = f'{_prefix}/SevenNet_MF_ompa/checkpoint_sevennet_mf_ompa.pth'
SEVENNET_omat = f'{_prefix}/SevenNet_omat/checkpoint_sevennet_omat.pth'

_git_prefix = 'https://github.com/MDIL-SNU/SevenNet/releases/download'
CHECKPOINT_DOWNLOAD_LINKS = {
SEVENNET_MF_ompa: f'{_git_prefix}/v0.11.0.cp/checkpoint_sevennet_mf_ompa.pth',
SEVENNET_omat: f'{_git_prefix}/v0.11.0.cp/checkpoint_sevennet_omat.pth',
}
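
The mapping above pairs each local checkpoint path with a release URL. How the package consumes it is not visible in this hunk, so the following is only a sketch of one plausible download-if-missing routine. The function `ensure_checkpoint`, the injectable `fetch` callable, and the `example.invalid` URL are all hypothetical; a fake fetcher is used so the example runs offline.

```python
import os
import tempfile
from typing import Callable, Dict, List


def ensure_checkpoint(
    path: str, links: Dict[str, str], fetch: Callable[[str], bytes]
) -> str:
    """Return `path`, downloading it first if it is missing and a link is known."""
    if os.path.isfile(path):
        return path  # already cached locally, nothing to do
    if path not in links:
        raise FileNotFoundError(f'{path} not found and no download link is known')
    data = fetch(links[path])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    tmp_path = path + '.part'  # write to a temporary name first
    with open(tmp_path, 'wb') as f:
        f.write(data)
    os.replace(tmp_path, path)  # then rename, so partial files are never used
    return path


calls: List[str] = []


def fake_fetch(url: str) -> bytes:
    """Offline stand-in for an HTTP download (hypothetical)."""
    calls.append(url)
    return b'dummy-weights'


with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, 'SevenNet_omat', 'checkpoint_sevennet_omat.pth')
    links = {ckpt: 'https://example.invalid/checkpoint_sevennet_omat.pth'}
    ensure_checkpoint(ckpt, links, fake_fetch)  # first call downloads
    ensure_checkpoint(ckpt, links, fake_fetch)  # second call hits the cache
    downloaded_once = len(calls) == 1
    with open(ckpt, 'rb') as f:
        content = f.read()
```

A real fetcher would likely wrap an HTTP client such as `requests` (added as a dependency in this PR) and check the response status before writing anything to disk.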
# to avoid torch script to compile torch_geometry.data
AtomGraphDataType = Dict[str, torch.Tensor]

@@ -143,7 +141,9 @@ def error_record_condition(x):
},
KEY.CUTOFF: float,
KEY.NUM_CONVOLUTION: int,
KEY.CONV_DENOMINATOR: lambda x: isinstance(x, float) or x in [
KEY.CONV_DENOMINATOR: lambda x: isinstance(x, float)
or x
in [
'avg_num_neigh',
'sqrt_avg_num_neigh',
],
93 changes: 64 additions & 29 deletions sevenn/calculator.py
@@ -2,7 +2,7 @@
import os
import pathlib
import warnings
from typing import Any, Optional, Union
from typing import Any, Dict, Optional, Union

import numpy as np
import torch
@@ -22,17 +22,12 @@


class SevenNetCalculator(Calculator):
"""ASE calculator for SevenNet models
"""Supporting properties:
'free_energy', 'energy', 'forces', 'stress', 'energies'
free_energy equals energy. 'energies' stores atomic energy.

Multi-GPU parallel MD is not supported for this mode.
Use LAMMPS for multi-GPU parallel MD.
This class is for convenience who want to run SevenNet models with ase.

Note that ASE calculator is designed to be interface of other programs.
But in this class, we simply run torch model inside ASE calculator.
So there is no FileIO things.

Here, free_energy = energy
Multi-GPU acceleration is not supported with ASE calculator.
You should use LAMMPS for the acceleration.
"""

def __init__(
@@ -42,14 +37,28 @@ def __init__(
device: Union[torch.device, str] = 'auto',
modal: Optional[str] = None,
enable_cueq: bool = False,
sevennet_config: Optional[Any] = None, # hold meta information
sevennet_config: Optional[Dict] = None, # Not used in logic, just meta info
**kwargs,
):
"""Initialize the calculator

Args:
model (SevenNet): path to the checkpoint file, or pretrained
device (str, optional): Torch device to use. Defaults to "auto".
"""Initialize SevenNetCalculator.

Parameters
----------
model: str | Path | AtomGraphSequential, default='7net-0'
Name of pretrained models (7net-mf-ompa, 7net-omat, 7net-l3i5, 7net-0) or
path to the checkpoint, deployed model or the model itself
file_type: str, default='checkpoint'
one of 'checkpoint' | 'torchscript' | 'model_instance'
device: str | torch.device, default='auto'
if not given, use CUDA if available
modal: str | None, default=None
modal (fidelity) to use if the given model is a multi-modal model. For
7net-mf-ompa, it should be one of 'mpa' (MPtrj + sAlex) or 'omat24' (OMat24);
case insensitive
enable_cueq: bool, default=False
if True, use cuEquivariance to accelerate inference.
sevennet_config: dict | None, default=None
Not used, but can be used to carry meta information of this calculator
"""
super().__init__(**kwargs)
self.sevennet_config = None
@@ -131,18 +140,21 @@ def __init__(

self.model = model_loaded

if isinstance(self.model, AtomGraphSequential) and modal:
if self.model.modal_map is None:
raise ValueError('Modality given, but model has no modal_map')
if modal not in self.model.modal_map:
_modals = list(self.model.modal_map.keys())
raise ValueError(f'Unknown modal {modal} (not in {_modals})')
        self.modal = None
        if isinstance(self.model, AtomGraphSequential):
            modal_map = self.model.modal_map
            if modal_map:
                modal_ava = list(modal_map.keys())
                if not modal:
                    raise ValueError(f'modal argument missing (avail: {modal_ava})')
                elif modal not in modal_ava:
                    raise ValueError(f'unknown modal {modal} (not in {modal_ava})')
                self.modal = modal
            elif not self.model.modal_map and modal:
                warnings.warn(f'modal={modal} is ignored as model has no modal_map')

self.model.to(self.device)
self.model.eval()

self.modal = modal
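
The new modal handling in this hunk can be read as a small decision table: a multi-modal model requires a valid `modal`, while a single-modal model ignores it with a warning. The standalone sketch below mirrors that logic outside the calculator so it can be exercised without loading a model. The function name `resolve_modal` and the integer values in the example map are hypothetical.

```python
import warnings
from typing import Dict, Optional


def resolve_modal(
    modal_map: Optional[Dict[str, int]], modal: Optional[str]
) -> Optional[str]:
    """Return the modal to use, following the checks in the hunk above."""
    if modal_map:
        available = list(modal_map.keys())
        if not modal:
            # Multi-modal model: the caller must choose a modality explicitly.
            raise ValueError(f'modal argument missing (avail: {available})')
        if modal not in available:
            raise ValueError(f'unknown modal {modal} (not in {available})')
        return modal
    if modal:
        # Single-modal model: the argument has no effect, so only warn.
        warnings.warn(f'modal={modal} is ignored as model has no modal_map')
    return None


# Hypothetical modal map for a model with two fidelities.
selected = resolve_modal({'mpa': 0, 'omat24': 1}, 'mpa')
```

One consequence of this change is that omitting `modal` for a multi-modal checkpoint now fails early with the list of available modals, instead of proceeding with `modal=None`.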

self.implemented_properties = [
'free_energy',
'energy',
@@ -216,6 +228,31 @@ def __init__(
cn_cutoff: float = 1600, # au^2, 0.52917726 angstrom = 1 au
**kwargs,
):
"""Initialize SevenNetD3Calculator. CUDA required.

Parameters
----------
model: str | Path | AtomGraphSequential
Name of pretrained models (7net-mf-ompa, 7net-omat, 7net-l3i5, 7net-0) or
path to the checkpoint, deployed model or the model itself
file_type: str, default='checkpoint'
one of 'checkpoint' | 'torchscript' | 'model_instance'
device: str | torch.device, default='auto'
if not given, use CUDA if available
modal: str | None, default=None
modal (fidelity) to use if the given model is a multi-modal model. For
7net-mf-ompa, it should be one of 'mpa' (MPtrj + sAlex) or 'omat24' (OMat24)
enable_cueq: bool, default=False
if True, use cuEquivariance to accelerate inference.
damping_type: str, default='damp_bj'
Damping type of D3, one of 'damp_bj' | 'damp_zero'
functional_name: str, default='pbe'
Target functional name of D3 parameters.
vdw_cutoff: float, default=9000
Squared vdw cutoff of the D3 calculator in au^2
cn_cutoff: float, default=1600
Squared cn cutoff of the D3 calculator in au^2 (0.52917726 angstrom = 1 au)
"""
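
The signature comment above gives `cn_cutoff` in au^2 with 0.52917726 angstrom per au, so the defaults are squared distances rather than radii. The short conversion below makes the corresponding radii explicit. It is only a sketch: the helper name is hypothetical, and it assumes `vdw_cutoff` uses the same au^2 convention as `cn_cutoff`.

```python
import math

BOHR_TO_ANGSTROM = 0.52917726  # 1 au of length in angstrom, from the signature comment


def squared_cutoff_to_angstrom(cutoff_au2: float) -> float:
    """Convert a squared cutoff in au^2 to a cutoff radius in angstrom."""
    return math.sqrt(cutoff_au2) * BOHR_TO_ANGSTROM


cn_radius = squared_cutoff_to_angstrom(1600)   # default cn_cutoff -> 40 au
vdw_radius = squared_cutoff_to_angstrom(9000)  # default vdw_cutoff, assuming au^2
```

Under these assumptions the defaults correspond to radii of roughly 21.2 angstrom (cn) and 50.2 angstrom (vdw).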
d3_calc = D3Calculator(
damping_type=damping_type,
functional_name=functional_name,
@@ -267,9 +304,7 @@ def _load(name: str) -> ctypes.CDLL:

load(
name=name,
sources=[
os.path.join(package_dir, 'pair_e3gnn', 'pair_d3_for_ase.cu')
],
sources=[os.path.join(package_dir, 'pair_e3gnn', 'pair_d3_for_ase.cu')],
extra_cuda_cflags=['-O3', '--expt-relaxed-constexpr', '-fmad=false'],
build_directory=compile_dir,
verbose=True,