Commit cd75a36

iProzd and janosh authored
Update DPA3-v1 to DPA3-v2 (#222)
* Update DeePMD models to dpa3-v2
* Update model-schema.yml
* add test_dpa3_kappa
* Update test_dpa3_kappa.py
* add back v1
* Update dpa3-v1-openlam.yml
* set status: superseded for DPA3 v1 models
  add energy parity figures for DPA3 models and per-element EACH errors
  describe model.status schema values in YAML comments
* reupload pred files for dpa3-v2-mptrj.yml and dpa3-v2-openlam.yml to own figshare
  - add geo_opt metrics in both YAML files
  - upload_model_preds_to_figshare.py script now supports exclusion prefixes for models and tasks, allowing users to specify items to exclude from processing
* fix RMSD colorscale in GeoOptMetricsTable not rendering lower as better
  - explain group header index logic in site/src/lib/HeatmapTable.svelte

---------

Co-authored-by: Janosh Riebesell <[email protected]>
1 parent 3ace0bd commit cd75a36

21 files changed, +768 -73 lines changed

data/training-sets.yml

+13 -1
@@ -54,7 +54,8 @@ Alex:
 OMat24:
   title: OMat24
   url: https://huggingface.co/datasets/fairchem/OMAT24#omat24-dataset
-  # this is the number of Alexandria materials that was sampled from, but according to Luis unclear if all were indeed sampled so this is an upper bound
+  # this is the number of Alexandria materials that was sampled from, but according to Luis unclear
+  # if all were indeed picked so this is an upper bound on the number of materials
   n_materials: 3_227_606
   n_structures: 100_824_585
   open: true
@@ -74,3 +75,14 @@ sAlex Validation:
   n_materials: 170_905 # approximate value! TODO confirm this
   n_structures: 553_218
   open: true
+
+OpenLAM:
+  title: OpenLAM dataset v1
+  url: https://aissquare.com/datasets/detail?pageType=datasets&name=LAMBench-TrainingSet-v1&id=308
+  download_url: https://aissquare.com/datasets/detail?pageType=datasets&name=LAMBench-TrainingSet-v1&id=308 # will be combined for downloading soon
+  n_structures: 162_507_178
+  open: true
+  description: |
+    This dataset integrates multidisciplinary DFT data sourced from Deep Modeling community (https://deepmodeling.com)
+    and other open repositories to pre-train large atomic models (LAMs),
+    while intentionally excluding overlap with WBM benchmark systems (e.g., Alex3D structures).

matbench_discovery/enums.py

+4 -2
@@ -302,8 +302,10 @@ class Model(Files, base_dir=f"{ROOT}/models"):
     cgcnn_p = auto(), "cgcnn/cgcnn+p.yml"
 
     # DeepMD-DPA3 models
-    dpa3_v1_mptrj = auto(), "deepmd/dpa3-v1-mptrj.yml"
-    dpa3_v1_openlam = auto(), "deepmd/dpa3-v1-openlam.yml"
+    dpa3_v2_mptrj = auto(), "deepmd/dpa3-v2-mptrj.yml"
+    dpa3_v2_openlam = auto(), "deepmd/dpa3-v2-openlam.yml"
+    # dpa3_v1_mptrj = auto(), "deepmd/dpa3-v1-mptrj.yml"
+    # dpa3_v1_openlam = auto(), "deepmd/dpa3-v1-openlam.yml"
 
     # FAIR-Chem
     eqv2_s_dens = auto(), "eqV2/eqV2-s-dens-mp.yml"
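
The hunk above swaps the registered DPA3 entries from v1 to v2 and leaves the v1 entries commented out. A minimal sketch of the general pattern (enum members that carry a YAML path relative to a base directory) follows. `ModelFile`, `MODELS_DIR`, `rel_path`, and `yaml_path` are hypothetical names for illustration; this is not the actual `Files` base class from `matbench_discovery/enums.py`.

```python
from enum import Enum

MODELS_DIR = "models"  # hypothetical stand-in for f"{ROOT}/models"


class ModelFile(Enum):
    """Hypothetical, simplified stand-in for the `Model` enum in the diff."""

    def __new__(cls, key: str, rel_path: str):
        # Each member is declared as a (key, relative path) tuple;
        # the key becomes the enum value, the path is stored as an attribute.
        obj = object.__new__(cls)
        obj._value_ = key
        obj.rel_path = rel_path
        return obj

    @property
    def yaml_path(self) -> str:
        """Path of the model's metadata file below the base directory."""
        return f"{MODELS_DIR}/{self.rel_path}"

    dpa3_v2_mptrj = "dpa3_v2_mptrj", "deepmd/dpa3-v2-mptrj.yml"
    dpa3_v2_openlam = "dpa3_v2_openlam", "deepmd/dpa3-v2-openlam.yml"
    # v1 entries left commented out, mirroring the diff:
    # dpa3_v1_mptrj = "dpa3_v1_mptrj", "deepmd/dpa3-v1-mptrj.yml"


print(ModelFile.dpa3_v2_mptrj.yaml_path)
print("dpa3_v1_mptrj" in ModelFile.__members__)
```

In this sketch, a commented-out member is simply absent from the enum, so any code that still refers to it by attribute would raise `AttributeError`. Whether the real `Model` enum behaves the same way for the superseded v1 models is not shown in this commit view.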

models/deepmd/dpa3-v1-mptrj.yml

+1
@@ -40,6 +40,7 @@ targets: EFS_G
 model_type: UIP
 model_params: 3_374_647
 n_estimators: 1
+status: superseded
 
 hyperparams:
   max_force: 0.05

models/deepmd/dpa3-v1-openlam.yml

+2 -1
@@ -40,6 +40,7 @@ targets: EFS_G
 model_type: UIP
 model_params: 8_184_608
 n_estimators: 1
+status: superseded
 
 hyperparams:
   max_force: 0.05
@@ -93,7 +94,7 @@ requirements:
   pymatgen: 2024.6.10
   numpy: 1.26.4
 
-training_set: [OMat24, MPtrj, sAlex] # need to update to OpenLAM
+training_set: [OpenLAM]
 
 notes:
   Description: |

models/deepmd/dpa3-v2-mptrj.yml

+181
@@ -0,0 +1,181 @@
+model_name: DPA3-v2-MPtrj
+model_key: dpa3-v2-mptrj
+model_version: v0.2 # 2025-03-14
+matbench_discovery_version: 1.3.1
+date_added: "2025-03-14"
+date_published: "2025-03-14"
+authors:
+  - name: Duo Zhang
+    affiliation: AI for Science Institute, Beijing
+    orcid: https://orcid.org/0000-0001-9591-2659
+  - name: Anyang Peng
+    affiliation: AI for Science Institute, Beijing
+    orcid: https://orcid.org/0000-0002-0630-2187
+  - name: Chun Cai
+    affiliation: AI for Science Institute, Beijing
+    orcid: https://orcid.org/0000-0001-6242-0439
+  - name: Linfeng Zhang
+    affiliation: AI for Science Institute, Beijing; DP Technology
+
+    corresponding: true
+  - name: Han Wang
+    affiliation: Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics
+
+    corresponding: true
+trained_by:
+  - name: Duo Zhang
+    affiliation: AI for Science Institute, Beijing
+    orcid: https://orcid.org/0000-0001-9591-2659
+repo: https://github.com/deepmodeling/deepmd-kit/tree/devel
+url: https://github.com/deepmodeling/deepmd-kit/tree/devel
+doi: https://github.com/deepmodeling/deepmd-kit/tree/devel # to be released soon
+paper: https://github.com/deepmodeling/deepmd-kit/tree/devel # to be released soon
+pr_url: https://github.com/janosh/matbench-discovery/pull/222
+trained_for_benchmark: true
+
+openness: OSOD
+train_task: S2EFS
+test_task: IS2RE-SR
+targets: EFS_G
+model_type: UIP
+model_params: 4_923_959
+n_estimators: 1
+
+hyperparams:
+  max_force: 0.05
+  max_steps: 500
+  ase_optimizer: FIRE
+  cell_filter: ExpCellFilter
+  n_layers: 24
+  e_rcut: 6.0
+  a_rcut: 4.0
+  n_dim: 128
+  e_dim: 64
+  a_dim: 32
+  optimizer: Adam
+  round1:
+    loss: MSE
+    loss_weights:
+      energy: 0.2 -> 20
+      force: 100 -> 20
+      virial: 0.02 -> 1
+    initial_learning_rate: 0.001
+    learning_rate_schedule: ExpLR - start_lr=0.001, decay_steps=5000, stop_lr=0.00001
+    training_steps: 2000000
+  round2:
+    loss: Huber
+    loss_weights:
+      energy: 15
+      force: 1
+      virial: 2.5
+    initial_learning_rate: 0.0002
+    learning_rate_schedule: ExpLR - start_lr=0.0002, decay_steps=5000, stop_lr=0.00001
+    training_steps: 1000000
+  batch_size: 64 # 16 (gpus) * 4 (batch per gpu) = 64 (total batch size)
+  epochs: 120 # round1 80 + round2 40
+
+requirements:
+  torch: 2.3.1
+  torch-geometric: 2.5.2
+  ase: 3.23.0
+  pymatgen: 2024.6.10
+  numpy: 1.26.4
+
+training_set: [MPtrj]
+
+notes:
+  Description: |
+    DPA3 is an advanced interatomic potential leveraging the message passing architecture, implemented within the DeePMD-kit framework, available at GitHub(https://github.com/deepmodeling/deepmd-kit/tree/devel).
+    Designed as a large atomic model (LAM), DPA3 is tailored to integrate and simultaneously train on datasets from various disciplines, encompassing diverse chemical and materials systems across different research domains.
+    Its model design ensures exceptional fitting accuracy and robust generalization both within and beyond the training domain.
+    Furthermore, DPA3 maintains energy conservation and respects the physical symmetries of the potential energy surface, making it a dependable tool for a wide range of scientific applications.
+
+metrics:
+  phonons:
+    kappa_103:
+      κ_SRME: 0.959
+      pred_file: models/deepmd/dpa3-v2-mptrj/2025-03-14-kappa-103-FIRE-dist=0.01-fmax=1e-4-symprec=1e-5.json.gz
+      pred_file_url: https://figshare.com/files/52988744
+  geo_opt:
+    pred_file: models/deepmd/dpa3-v2-mptrj/2025-03-14-wbm-geo-opt.json.gz
+    struct_col: dp_structure
+    pred_file_url: https://figshare.com/files/53018849
+    symprec=1e-2:
+      rmsd: 0.0164 # Å
+      n_sym_ops_mae: 1.968 # unitless
+      symmetry_decrease: 0.0601 # fraction
+      symmetry_match: 0.8052 # fraction
+      symmetry_increase: 0.1273 # fraction
+      n_structures: 256963 # count
+      analysis_file: models/deepmd/dpa3-v2-mptrj/2025-03-14-wbm-geo-opt-symprec=1e-2-moyo=0.4.2.csv.gz
+      analysis_file_url: https://figshare.com/files/53019278
+    symprec=1e-5:
+      rmsd: 0.0164 # Å
+      n_sym_ops_mae: 2.1461 # unitless
+      symmetry_decrease: 0.0766 # fraction
+      symmetry_match: 0.7154 # fraction
+      symmetry_increase: 0.2014 # fraction
+      n_structures: 256963 # count
+      analysis_file: models/deepmd/dpa3-v2-mptrj/2025-03-14-wbm-geo-opt-symprec=1e-5-moyo=0.4.2.csv.gz
+      analysis_file_url: https://figshare.com/files/53019281
+  discovery:
+    pred_file: models/deepmd/dpa3-v2-mptrj/2025-03-14-wbm-IS2RE.csv.gz
+    pred_file_url: https://figshare.com/files/53018801
+    pred_col: e_form_per_atom_dp
+    full_test_set:
+      F1: 0.774 # fraction
+      DAF: 4.25 # dimensionless
+      Precision: 0.729 # fraction
+      Recall: 0.825 # fraction
+      Accuracy: 0.917 # fraction
+      TPR: 0.825 # fraction
+      FPR: 0.064 # fraction
+      TNR: 0.936 # fraction
+      FNR: 0.175 # fraction
+      TP: 36393.0 # count
+      FP: 13519.0 # count
+      TN: 199352.0 # count
+      FN: 7699.0 # count
+      MAE: 0.038 # eV/atom
+      RMSE: 0.082 # eV/atom
+      R2: 0.796 # dimensionless
+      missing_preds: 0 # count
+      missing_percent: 0.00% # fraction
+    most_stable_10k:
+      F1: 0.980 # fraction
+      DAF: 6.280 # dimensionless
+      Precision: 0.960 # fraction
+      Recall: 1.0 # fraction
+      Accuracy: 0.960 # fraction
+      TPR: 1.0 # fraction
+      FPR: 1.0 # fraction
+      TNR: 0.0 # fraction
+      FNR: 0.0 # fraction
+      TP: 9600.0 # count
+      FP: 400.0 # count
+      TN: 0.0 # count
+      FN: 0.0 # count
+      MAE: 0.032 # eV/atom
+      RMSE: 0.078 # eV/atom
+      R2: 0.866 # dimensionless
+      missing_preds: 0 # count
+      missing_percent: 0.00% # fraction
+    unique_prototypes:
+      F1: 0.786 # fraction
+      DAF: 4.760 # dimensionless
+      Precision: 0.737 # fraction
+      Recall: 0.841 # fraction
+      Accuracy: 0.929 # fraction
+      TPR: 0.841 # fraction
+      FPR: 0.055 # fraction
+      TNR: 0.945 # fraction
+      FNR: 0.159 # fraction
+      TP: 28073.0 # count
+      FP: 10008.0 # count
+      TN: 172106.0 # count
+      FN: 5301.0 # count
+      MAE: 0.039 # eV/atom
+      RMSE: 0.081 # eV/atom
+      R2: 0.804 # dimensionless
+      missing_preds: 0 # count
+      missing_percent: 0.00% # fraction
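
The `full_test_set` classification metrics in this file can be cross-checked against the confusion-matrix counts it reports. The sketch below recomputes them from `TP`, `FP`, `TN`, and `FN`. The function name `classification_metrics` is hypothetical, and the DAF formula (precision divided by the prevalence of stable materials in the evaluated set) is an assumption that happens to reproduce the reported value for the full test set. It would not reproduce the `most_stable_10k` DAF, which appears to use a different prevalence.

```python
def classification_metrics(tp: float, fp: float, tn: float, fn: float) -> dict[str, float]:
    """Recompute standard binary-classification metrics from confusion counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # identical to TPR
    prevalence = (tp + fn) / total  # fraction of truly stable materials
    return {
        "Precision": precision,
        "Recall": recall,
        "F1": 2 * precision * recall / (precision + recall),
        "Accuracy": (tp + tn) / total,
        "FPR": fp / (fp + tn),
        "TNR": tn / (fp + tn),
        "FNR": fn / (tp + fn),
        # Assumed definition: discovery acceleration factor = precision / prevalence
        "DAF": precision / prevalence,
    }


# Counts copied from the full_test_set block of dpa3-v2-mptrj.yml
full = classification_metrics(tp=36393, fp=13519, tn=199352, fn=7699)
for name, value in full.items():
    print(f"{name}: {value:.3f}")
```

Rounded to the precision used in the YAML file, these values agree with the reported `F1`, `Precision`, `Recall`, `Accuracy`, `FPR`, and `DAF` entries for the full test set.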

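The `learning_rate_schedule` entries in the hyperparameters above give a start rate, a decay interval, and a stop rate. One plausible reading, sketched below, is a staircase exponential decay whose rate is chosen so that the learning rate reaches `stop_lr` at the final training step. Both this interpretation and the function name `exp_lr` are assumptions; the commit does not show the actual scheduler code.

```python
import math


def exp_lr(step: int, start_lr: float, stop_lr: float, decay_steps: int, total_steps: int) -> float:
    """Staircase exponential decay from start_lr to stop_lr (assumed interpretation)."""
    n_decays = total_steps / decay_steps
    # Per-interval decay factor such that start_lr * rate**n_decays == stop_lr
    decay_rate = math.exp(math.log(stop_lr / start_lr) / n_decays)
    return start_lr * decay_rate ** (step // decay_steps)


# round1 values from the YAML: start_lr=0.001, decay_steps=5000, stop_lr=0.00001
round1 = dict(start_lr=1e-3, stop_lr=1e-5, decay_steps=5000, total_steps=2_000_000)
for step in (0, 500_000, 1_000_000, 2_000_000):
    print(step, exp_lr(step, **round1))
```

Under this reading the learning rate is held constant within each 5000-step interval and multiplied by a fixed factor at each interval boundary.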