Commit 245920c

Merge pull request #184 from MDIL-SNU/checkpoint
Add: 7net-mf-ompa, 7net-omat
2 parents e3156f3 + abb088b commit 245920c

8 files changed, +349 -76 lines changed

CHANGELOG.md

+5-6
Original file line number | Diff line number | Diff line change
@@ -3,20 +3,19 @@ All notable changes to this project will be documented in this file.
33

44

55
## [0.11.0]
6+
7+
Multi-fidelity learning implemented & New pretrained-models
8+
69
### Added
710
- Build multi-fidelity model, SevenNet-MF, based on given modality in the yaml
811
- Modality support for sevenn_inference, sevenn_get_modal, and SevenNetCalculator
9-
- [cli] sevenn_cp tool for checkpoint summary, input generation, multi-modal routines
12+
- sevenn_cp tool for checkpoint summary, input generation, multi-modal routines
1013
- Modality append / assign using sevenn_cp
1114
- Loss weighting for energy, force and stress for corresponding data label
1215
- Ignore unlabelled data when calculating loss. (e.g. stress data for non-pbc structure)
1316
- Dict style dataset input for multi-modal and data-weight
1417
- (experimental) cuEquivariance support
15-
16-
### Added (code)
17-
- sevenn.train.modal_dataset SevenNetMultiModalDataset
18-
- sevenn.scripts.backward_compatibility.py
19-
- sevenn.checkpoint.py
18+
- Downloading large checkpoints from url (7net-MF-ompa, 7net-omat)
2019
- D3 wB97M param
2120

2221
### Changed

README.md

+52-16
@@ -5,10 +5,8 @@
55

66
SevenNet (Scalable EquiVariance Enabled Neural Network) is a graph neural network (GNN) interatomic potential package that supports parallel molecular dynamics simulations with [`LAMMPS`](https://lammps.org). Its underlying GNN model is based on [`NequIP`](https://github.com/mir-group/nequip).
77

8-
> [!CAUTION]
9-
> SevenNet+LAMMPS parallel after the commit id of `14851ef (v0.9.3 ~ 0.9.5)` has a serious bug.
10-
> It gives wrong forces when the number of mpi processes is greater than two. The corresponding pip version is yanked for this reason. The bug is fixed for the main branch since `v0.10.0`, and pip (`v0.9.3.post0`).
11-
8+
> [!NOTE]
9+
> We will soon release a CUDA-accelerated version of SevenNet, which will significantly increase the speed of our pre-trained models on [Matbench Discovery](https://matbench-discovery.materialsproject.org/).
1210
1311
## Features
1412
- Pre-trained GNN interatomic potential and fine-tuning interface.
@@ -19,29 +17,66 @@ SevenNet (Scalable EquiVariance Enabled Neural Network) is a graph neural networ
1917

2018
## Pre-trained models
2119
So far, we have released three pre-trained SevenNet models. Each model has various hyperparameters and training sets, resulting in different accuracy and speed. Please read the descriptions below carefully and choose the model that best suits your purpose.
22-
We provide the training set MAEs (energy, force, and stress) F1 score for WBM dataset and $\kappa_{\mathrm{SRME}}$ from phonondb. For details on these metrics and performance comparisons with other pre-trained models, please visit [Matbench Discovery](https://matbench-discovery.materialsproject.org/).
20+
We provide the training set MAEs (energy, force, and stress), F1 score, and RMSD for the WBM dataset, as well as $\kappa_{\mathrm{SRME}}$ from phonondb and CPS (Combined Performance Score). For details on these metrics and performance comparisons with other pre-trained models, please visit [Matbench Discovery](https://matbench-discovery.materialsproject.org/).
2321

2422
These models can be used as interatomic potentials in LAMMPS and can also be loaded through the ASE calculator by passing the `keywords` of each model. Please refer to [ASE calculator](#ase_calculator) to see how to load a model through the ASE calculator.
2523
Additionally, `keywords` can be called in other parts of SevenNet, such as `sevenn_inference`, `sevenn_get_model`, and `checkpoint` in `input.yaml` for fine-tuning.
2624

2725
**Acknowledgments**: The models trained on [`MPtrj`](https://figshare.com/articles/dataset/Materials_Project_Trjectory_MPtrj_Dataset/23713842) were supported by the Neural Processing Research Center program of Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd. The computations for training models were carried out using the Samsung SSC-21 cluster.
2826

27+
---
28+
29+
### **SevenNet-MF-ompa (17Mar2025)**
30+
> Model keywords: `7net-mf-ompa` | `SevenNet-mf-ompa`
31+
32+
**This is our recommended pre-trained model**
33+
34+
This model leverages [multi-fidelity learning](https://pubs.acs.org/doi/10.1021/jacs.4c14455) to simultaneously train on the [MPtrj](https://figshare.com/articles/dataset/Materials_Project_Trjectory_MPtrj_Dataset/23713842), [sAlex](https://huggingface.co/datasets/fairchem/OMAT24), and [OMat24](https://huggingface.co/datasets/fairchem/OMAT24) datasets. As of March 17, 2025, it has achieved state-of-the-art performance on [Matbench Discovery](https://matbench-discovery.materialsproject.org/) in CPS (Combined Performance Score). We have found that this model outperforms `SevenNet-l3i5` on most tasks, except for isolated molecule energy, where it performs slightly worse.
35+
36+
```python
37+
from sevenn.calculator import SevenNetCalculator
38+
# "mpa" refers to the MPtrj + sAlex modal, used for evaluating Matbench Discovery.
39+
calc = SevenNetCalculator('7net-mf-ompa', modal='mpa') # Use modal='omat24' for OMat24-trained modal weights.
40+
```
41+
Theoretically, the `mpa` modal should produce PBE52 results, while the `omat24` modal yields PBE54 results.
42+
43+
When using the command-line interface of SevenNet, include the `--modal mpa` or `--modal omat24` option to select the desired modality.
44+
45+
46+
#### **Matbench Discovery**
47+
| CPS | F1 | $\kappa_{\mathrm{SRME}}$ | RMSD |
48+
|:---:|:---:|:---:|:---:|
49+
|**0.883**|**0.901**|0.317| **0.0115** |
50+
51+
[Detailed instructions for multi-fidelity](https://github.com/MDIL-SNU/SevenNet/blob/main/sevenn/pretrained_potentials/SevenNet_MF_0/README.md)
52+
53+
[Link to the full-information checkpoint](https://figshare.com/articles/software/7net_MF_ompa/28590722?file=53029859)
2954

3055
---
56+
### **SevenNet-omat (17Mar2025)**
57+
> Model keywords: `7net-omat` | `SevenNet-omat`
58+
59+
This model was trained solely on the [OMat24](https://huggingface.co/datasets/fairchem/OMAT24) dataset. It achieves state-of-the-art (SOTA) performance in $\kappa_{\mathrm{SRME}}$ on [Matbench Discovery](https://matbench-discovery.materialsproject.org/); however, the F1 score was not available due to a difference in the POTCAR version. Similar to `SevenNet-MF-ompa`, this model outperforms `SevenNet-l3i5` in most tasks, except for isolated molecule energy.
3160

61+
[Link to the full-information checkpoint](https://figshare.com/articles/software/SevenNet_omat/28593938).
62+
63+
#### **Matbench Discovery**
64+
* $\kappa_{\mathrm{SRME}}$: **0.221**
65+
---
3266
### **SevenNet-l3i5 (12Dec2024)**
33-
> Keywords in ASE: `7net-l3i5` and `SevenNet-l3i5`
67+
> Model keywords: `7net-l3i5` | `SevenNet-l3i5`
68+
69+
The model increases the maximum spherical harmonic degree ($l_{\mathrm{max}}$) to 3, compared to `SevenNet-0` with $l_{\mathrm{max}}$ of 2. While **l3i5** offers improved accuracy across various systems compared to `SevenNet-0`, it is approximately four times slower. As of March 17, 2025, this model has achieved state-of-the-art (SOTA) performance among compliant models on the CPS metric, which was newly introduced in [Matbench Discovery](https://matbench-discovery.materialsproject.org/).
3470

35-
The model increases the maximum spherical harmonic degree ($l_{\mathrm{max}}$) to 3, compared to **SevenNet-0 (11Jul2024)** with $l_{\mathrm{max}}$ of 2.
36-
While **l3i5** offers improved accuracy across various systems compared to **SevenNet-0 (11Jul2024)**, it is approximately four times slower.
71+
#### **Matbench Discovery**
72+
| CPS | F1 | $\kappa_{\mathrm{SRME}}$ | RMSD |
73+
|:---:|:---:|:---:|:---:|
74+
|0.764 |0.76|0.55|0.0182|
3775

38-
* Training set MAE: 8.3 meV/atom (energy), 0.029 eV/Ang. (force), and 2.33 kbar (stress)
39-
* Matbench F1 score: 0.76, $\kappa_{\mathrm{SRME}}$: 0.560
40-
* Training time: 381 GPU-days on A100
4176
---
4277

4378
### **SevenNet-0 (11Jul2024)**
44-
> Keywords in ASE: `7net-0`, `SevenNet-0`, `7net-0_11Jul2024`, and `SevenNet-0_11Jul2024`
79+
> Model keywords: `7net-0` | `SevenNet-0` | `7net-0_11Jul2024` | `SevenNet-0_11Jul2024`
4580
4681
The model architecture is mainly in line with [GNoME](https://github.com/google-deepmind/materials_discovery), a pretrained model that utilizes the NequIP architecture.
4782
Five interaction blocks with node features that consist of 128 scalars (*l*=0), 64 vectors (*l*=1), and 32 tensors (*l*=2).
@@ -50,9 +85,11 @@ The model was trained with [MPtrj](https://figshare.com/articles/dataset/Materia
5085
This model is loaded as the default pre-trained model in ASE calculator.
5186
For more information, click [here](sevenn/pretrained_potentials/SevenNet_0__11Jul2024).
5287

53-
* Training set MAE: 11.5 meV/atom (energy), 0.041 eV/Ang. (force), and 2.78 kbar (stress)
54-
* Matbench F1 score: 0.67, $\kappa_{\mathrm{SRME}}$: 0.767
55-
* Training time: 90 GPU-days on A100
88+
#### **Matbench Discovery**
89+
| F1 | $\kappa_{\mathrm{SRME}}$ |
90+
|:---:|:---:|
91+
|0.67|0.767|
92+
5693
---
5794

5895
In addition to these latest models, you can find our legacy models from [pretrained_potentials](./sevenn/pretrained_potentials).
@@ -106,7 +143,6 @@ The model can be loaded through the following Python code.
106143
from sevenn.calculator import SevenNetCalculator
107144
calc = SevenNetCalculator(model='7net-0', device='cpu')
108145
```
109-
110146
SevenNet supports CUDA accelerated D3Calculator.
111147
```python
112148
from sevenn.calculator import SevenNetD3Calculator

pyproject.toml

+1
@@ -28,6 +28,7 @@ dependencies = [
2828
"numpy",
2929
"matscipy",
3030
"pandas",
31+
"requests",
3132
]
3233
[project.optional-dependencies]
3334
test = ["matscipy", "pytest-cov>=5"]

setup.cfg

+1-1
@@ -10,5 +10,5 @@ include_trailing_comma=True
1010
force_grid_wrap=0
1111
use_parentheses=True
1212
line_length=80
13-
known_third_party=ase,braceexpand,e3nn,numpy,packaging,pandas,pytest,sklearn,torch,torch_geometric,tqdm,yaml
13+
known_third_party=ase,braceexpand,e3nn,numpy,packaging,pandas,pytest,requests,sklearn,torch,torch_geometric,tqdm,yaml
1414
known_first_party=

sevenn/_const.py

+15-15
@@ -48,20 +48,18 @@
4848
ACTIVATION_DICT = {'e': ACTIVATION_FOR_EVEN, 'o': ACTIVATION_FOR_ODD}
4949

5050
_prefix = os.path.abspath(f'{os.path.dirname(__file__)}/pretrained_potentials')
51-
SEVENNET_0_11Jul2024 = (
52-
f'{_prefix}/SevenNet_0__11Jul2024/checkpoint_sevennet_0.pth'
53-
)
54-
SEVENNET_0_22May2024 = (
55-
f'{_prefix}/SevenNet_0__22May2024/checkpoint_sevennet_0.pth'
56-
)
57-
SEVENNET_l3i5 = (
58-
f'{_prefix}/SevenNet_l3i5/checkpoint_l3i5.pth'
59-
)
60-
SEVENNET_MF_0 = (
61-
f'{_prefix}/SevenNet_MF_0/checkpoint_sevennet_mf_0.pth'
62-
)
63-
64-
51+
SEVENNET_0_11Jul2024 = f'{_prefix}/SevenNet_0__11Jul2024/checkpoint_sevennet_0.pth'
52+
SEVENNET_0_22May2024 = f'{_prefix}/SevenNet_0__22May2024/checkpoint_sevennet_0.pth'
53+
SEVENNET_l3i5 = f'{_prefix}/SevenNet_l3i5/checkpoint_l3i5.pth'
54+
SEVENNET_MF_0 = f'{_prefix}/SevenNet_MF_0/checkpoint_sevennet_mf_0.pth'
55+
SEVENNET_MF_ompa = f'{_prefix}/SevenNet_MF_ompa/checkpoint_sevennet_mf_ompa.pth'
56+
SEVENNET_omat = f'{_prefix}/SevenNet_omat/checkpoint_sevennet_omat.pth'
57+
58+
_git_prefix = 'https://github.com/MDIL-SNU/SevenNet/releases/download'
59+
CHECKPOINT_DOWNLOAD_LINKS = {
60+
SEVENNET_MF_ompa: f'{_git_prefix}/v0.11.0.cp/checkpoint_sevennet_mf_ompa.pth',
61+
SEVENNET_omat: f'{_git_prefix}/v0.11.0.cp/checkpoint_sevennet_omat.pth',
62+
}
6563
# to avoid torch script to compile torch_geometry.data
6664
AtomGraphDataType = Dict[str, torch.Tensor]
6765

@@ -143,7 +141,9 @@ def error_record_condition(x):
143141
},
144142
KEY.CUTOFF: float,
145143
KEY.NUM_CONVOLUTION: int,
146-
KEY.CONV_DENOMINATOR: lambda x: isinstance(x, float) or x in [
144+
KEY.CONV_DENOMINATOR: lambda x: isinstance(x, float)
145+
or x
146+
in [
147147
'avg_num_neigh',
148148
'sqrt_avg_num_neigh',
149149
],
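
The `CHECKPOINT_DOWNLOAD_LINKS` mapping added to `sevenn/_const.py` above keys each release URL by the local path where the checkpoint is expected. The code that consumes this mapping is not part of the hunks shown here, so the following is only a minimal sketch of how such a lookup might work. The helper `locate_checkpoint`, its `exists` parameter, and the shortened local path are hypothetical names introduced for illustration; they are not SevenNet's actual loader.

```python
import os
from typing import Callable, Dict


def locate_checkpoint(
    path: str,
    download_links: Dict[str, str],
    exists: Callable[[str], bool] = os.path.exists,
) -> str:
    """Return `path` if it exists locally, else the URL registered for it.

    Hypothetical helper for illustration only (an assumption, not SevenNet code).
    """
    if exists(path):
        # Checkpoint is already on disk; nothing to download.
        return path
    if path in download_links:
        # A caller would fetch this URL and save it to `path`.
        return download_links[path]
    raise FileNotFoundError(f'{path} not found and no download link registered')


# Stand-ins mirroring the constants in the diff; the local path is shortened
# here (assumption) instead of being built from the package directory.
_git_prefix = 'https://github.com/MDIL-SNU/SevenNet/releases/download'
SEVENNET_omat = 'pretrained_potentials/SevenNet_omat/checkpoint_sevennet_omat.pth'
LINKS = {SEVENNET_omat: f'{_git_prefix}/v0.11.0.cp/checkpoint_sevennet_omat.pth'}
```

Keying the mapping by local path means one lookup answers both where the file should live and where it can be fetched from. Small checkpoints that ship with the package never appear in the mapping.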

sevenn/calculator.py

+64-29
@@ -2,7 +2,7 @@
22
import os
33
import pathlib
44
import warnings
5-
from typing import Any, Optional, Union
5+
from typing import Any, Dict, Optional, Union
66

77
import numpy as np
88
import torch
@@ -22,17 +22,12 @@
2222

2323

2424
class SevenNetCalculator(Calculator):
25-
"""ASE calculator for SevenNet models
25+
"""Supporting properties:
26+
'free_energy', 'energy', 'forces', 'stress', 'energies'
27+
free_energy equals energy. 'energies' stores atomic energy.
2628
27-
Multi-GPU parallel MD is not supported for this mode.
28-
Use LAMMPS for multi-GPU parallel MD.
29-
This class is for convenience who want to run SevenNet models with ase.
30-
31-
Note than ASE calculator is designed to be interface of other programs.
32-
But in this class, we simply run torch model inside ASE calculator.
33-
So there is no FileIO things.
34-
35-
Here, free_energy = energy
29+
Multi-GPU acceleration is not supported with ASE calculator.
30+
You should use LAMMPS for the acceleration.
3631
"""
3732

3833
def __init__(
@@ -42,14 +37,28 @@ def __init__(
4237
device: Union[torch.device, str] = 'auto',
4338
modal: Optional[str] = None,
4439
enable_cueq: bool = False,
45-
sevennet_config: Optional[Any] = None, # hold meta information
40+
sevennet_config: Optional[Dict] = None, # Not used in logic, just meta info
4641
**kwargs,
4742
):
48-
"""Initialize the calculator
49-
50-
Args:
51-
model (SevenNet): path to the checkpoint file, or pretrained
52-
device (str, optional): Torch device to use. Defaults to "auto".
43+
"""Initialize SevenNetCalculator.
44+
45+
Parameters
46+
----------
47+
model: str | Path | AtomGraphSequential, default='7net-0'
48+
Name of pretrained models (7net-mf-ompa, 7net-omat, 7net-l3i5, 7net-0) or
49+
path to the checkpoint, deployed model or the model itself
50+
file_type: str, default='checkpoint'
51+
one of 'checkpoint' | 'torchscript' | 'model_instance'
52+
device: str | torch.device, default='auto'
53+
if not given, use CUDA if available
54+
modal: str | None, default=None
55+
modal (fidelity) if given model is multi-modal model. for 7net-mf-ompa,
56+
it should be one of 'mpa' (MPtrj + sAlex) or 'omat24' (OMat24)
57+
case insensitive
58+
enable_cueq: bool, default=False
59+
if True, use cuEquivariant to accelerate inference.
60+
sevennet_config: dict | None, default=None
61+
Not used, but can be used to carry meta information of this calculator
5362
"""
5463
super().__init__(**kwargs)
5564
self.sevennet_config = None
@@ -131,18 +140,21 @@ def __init__(
131140

132141
self.model = model_loaded
133142

134-
if isinstance(self.model, AtomGraphSequential) and modal:
135-
if self.model.modal_map is None:
136-
raise ValueError('Modality given, but model has no modal_map')
137-
if modal not in self.model.modal_map:
138-
_modals = list(self.model.modal_map.keys())
139-
raise ValueError(f'Unknown modal {modal} (not in {_modals})')
143+
self.modal = None
144+
if isinstance(self.model, AtomGraphSequential):
145+
modal_map = self.model.modal_map
146+
if modal_map:
147+
modal_ava = list(modal_map.keys())
148+
if not modal:
149+
raise ValueError(f'modal argument missing (avail: {modal_ava})')
150+
elif modal not in modal_ava:
151+
raise ValueError(f'unknown modal {modal} (not in {modal_ava})')
152+
self.modal = modal
153+
elif not self.model.modal_map and modal:
154+
warnings.warn(f'modal={modal} is ignored as model has no modal_map')
140155

141156
self.model.to(self.device)
142157
self.model.eval()
143-
144-
self.modal = modal
145-
146158
self.implemented_properties = [
147159
'free_energy',
148160
'energy',
@@ -216,6 +228,31 @@ def __init__(
216228
cn_cutoff: float = 1600, # au^2, 0.52917726 angstrom = 1 au
217229
**kwargs,
218230
):
231+
"""Initialize SevenNetD3Calculator. CUDA required.
232+
233+
Parameters
234+
----------
235+
model: str | Path | AtomGraphSequential
236+
Name of pretrained models (7net-mf-ompa, 7net-omat, 7net-l3i5, 7net-0) or
237+
path to the checkpoint, deployed model or the model itself
238+
file_type: str, default='checkpoint'
239+
one of 'checkpoint' | 'torchscript' | 'model_instance'
240+
device: str | torch.device, default='auto'
241+
if not given, use CUDA if available
242+
modal: str | None, default=None
243+
modal (fidelity) if given model is multi-modal model. for 7net-mf-ompa,
244+
it should be one of 'mpa' (MPtrj + sAlex) or 'omat24' (OMat24)
245+
enable_cueq: bool, default=False
246+
if True, use cuEquivariant to accelerate inference.
247+
damping_type: str, default='damp_bj'
248+
Damping type of D3, one of 'damp_bj' | 'damp_zero'
249+
functional_name: str, default='pbe'
250+
Target functional name of D3 parameters.
251+
vdw_cutoff: float, default=9000
252+
vdw cutoff of D3 calculator in au
253+
cn_cutoff: float, default=1600
254+
cn cutoff of D3 calculator in au
255+
"""
219256
d3_calc = D3Calculator(
220257
damping_type=damping_type,
221258
functional_name=functional_name,
@@ -267,9 +304,7 @@ def _load(name: str) -> ctypes.CDLL:
267304

268305
load(
269306
name=name,
270-
sources=[
271-
os.path.join(package_dir, 'pair_e3gnn', 'pair_d3_for_ase.cu')
272-
],
307+
sources=[os.path.join(package_dir, 'pair_e3gnn', 'pair_d3_for_ase.cu')],
273308
extra_cuda_cflags=['-O3', '--expt-relaxed-constexpr', '-fmad=false'],
274309
build_directory=compile_dir,
275310
verbose=True,
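
The modal handling added to `SevenNetCalculator.__init__` above changes behavior in two ways. A multi-modal model now requires a `modal` argument, and a `modal` passed to a single-modal model is ignored with a warning instead of being stored. The sketch below restates that branch logic as a standalone function so it can be tried without loading a model. The function name `resolve_modal` and the `Dict[str, int]` type for `modal_map` are assumptions made for illustration.

```python
import warnings
from typing import Dict, Optional


def resolve_modal(
    modal_map: Optional[Dict[str, int]], modal: Optional[str]
) -> Optional[str]:
    """Mirror the modal checks from the diff (hypothetical standalone form)."""
    if modal_map:
        available = list(modal_map.keys())
        if not modal:
            # Multi-modal model: the caller must choose a modality.
            raise ValueError(f'modal argument missing (avail: {available})')
        if modal not in available:
            raise ValueError(f'unknown modal {modal} (not in {available})')
        return modal
    if modal:
        # Single-modal model: the argument has no effect.
        warnings.warn(f'modal={modal} is ignored as model has no modal_map')
    return None
```

For example, `resolve_modal({'mpa': 0, 'omat24': 1}, 'mpa')` returns `'mpa'`, while the same map with `modal=None` raises `ValueError`.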
