Hi @RaulPPelaez @AdriaPerezCulubret @peastman @stefdoerr @guillemsimeon, I'm using the following parameters while training on alanine dipeptide, but after training I'm getting some crazy ckpt files. Are these hyperparameters fine?
-rw------- 1 cyz218385 cyz21 3.5M Feb 4 20:00 epoch=1-val_loss=28354817745406260133494784.0000-test_loss=0.0000.ckpt
-rw------- 1 cyz218385 cyz21 3.5M Feb 4 20:01 epoch=3-val_loss=28354817745406260133494784.0000-test_loss=4108202541056.0000.ckpt
-rw------- 1 cyz218385 cyz21 3.5M Feb 4 20:03 epoch=5-val_loss=28354817745406260133494784.0000-test_loss=4108202541056.0000.ckpt
-rw------- 1 cyz218385 cyz21 3.5M Feb 4 20:04 epoch=7-val_loss=28354817745406260133494784.0000-test_loss=4108202541056.0000.ckpt
-rw------- 1 cyz218385 cyz21 3.5M Feb 4 20:05 epoch=9-val_loss=28354817745406260133494784.0000-test_loss=4108202541056.0000.ckpt
-rw------- 1 cyz218385 cyz21 3.5M Feb 4 20:07 epoch=11-val_loss=28354817745406260133494784.0000-test_loss=4108202541056.0000.ckpt
-rw------- 1 cyz218385 cyz21 3.5M Feb 4 20:08 epoch=13-val_loss=28354817745406260133494784.0000-test_loss=4108202541056.0000.ckpt
-rw------- 1 cyz218385 cyz21 3.5M Feb 4 20:10 epoch=15-val_loss=28354817745406260133494784.0000-test_loss=4108202541056.0000.ckpt
-rw------- 1 cyz218385 cyz21 3.5M Feb 4 20:11 epoch=17-val_loss=28354817745406260133494784.0000-test_loss=4108202541056.0000.ckpt
-rw------- 1 cyz218385 cyz21 3.5M Feb 4 20:13 epoch=19-val_loss=28354817745406260133494784.0000-test_loss=4108202541056.0000.ckpt
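A validation loss around 1e25 in the very first epochs usually points to diverging training or bad input data rather than a checkpointing problem. Before touching hyperparameters, it can help to sanity-check the `.npy` inputs for NaN/Inf entries and mismatched scales (e.g. forces in the wrong units). A minimal sketch, with synthetic arrays standing in for the real `31jan_*.npy` files:

```python
import numpy as np

def sanity_check(coords, forces):
    """Report basic statistics that often reveal diverging-loss causes:
    NaN/Inf entries or wildly mismatched scales between arrays."""
    report = {}
    for name, arr in (("coords", coords), ("forces", forces)):
        report[name] = {
            "finite": bool(np.isfinite(arr).all()),   # False if any NaN/Inf
            "mean_abs": float(np.abs(arr).mean()),    # typical magnitude
            "max_abs": float(np.abs(arr).max()),      # worst-case magnitude
        }
    return report

# Synthetic stand-ins for the real 31jan_coords.npy / 31jan_ca_deltaforces.npy:
coords = np.random.randn(100, 10, 3)        # (frames, beads, xyz)
forces = np.random.randn(100, 10, 3) * 1e6  # a suspiciously large force scale
print(sanity_check(coords, forces))
```

If `finite` comes back False, or the force magnitudes are many orders of magnitude larger than expected for the force units the model assumes, that alone can explain losses of this size.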
TRAIN.YAML
batch_size: 256
#inference_batchsize: 256
dataset: Custom
coord_files: "31jan_coords.npy"
embed_files: "31jan_ca_embeddings.npy"
force_files: "31jan_ca_deltaforces.npy"
cutoff_upper: 12.0
cutoff_lower: 3.0
log_dir: /home/chemistry/phd/cyz218385/scratch/aladi_wat/aladi_300k
derivative: true
#distributed_backend: ddp
early_stopping_patience: 30
embedding_dimension: 128
#label:
#- forces
lr: 0.0005
lr_factor: 0.8
lr_min: 1.0e-06
lr_patience: 10
lr_warmup_steps: 0
model: graph-network
neighbor_embedding: false
ngpus: -1
num_epochs: 100
num_layers: 4
num_nodes: 1
num_rbf: 18
num_workers: 8
rbf_type: expnorm
save_interval: 2
seed: 1
test_interval: 2
test_size: 80
trainable_rbf: true
val_size: 20
weight_decay: 0.0
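If the data checks out, `lr: 0.0005` with `lr_warmup_steps: 0` may simply be too aggressive for this dataset, since a few bad early batches can blow the weights up before the scheduler reacts. The usual first responses are lowering the learning rate and capping the gradient norm. This is a generic PyTorch sketch of those two knobs, not torchmd-net's actual training loop (the model, data, and `lr` value here are placeholders):

```python
import torch

# Toy model/data standing in for the real graph network and CG dataset.
model = torch.nn.Linear(3, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)  # try 1e-4 instead of 5e-4
x, y = torch.randn(32, 3), torch.randn(32, 1)

for _ in range(5):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    # Cap the global gradient norm so one bad batch cannot blow up the weights.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)
    opt.step()

print(float(loss))
```

Whether torchmd-net exposes gradient clipping directly in the YAML config I'd have to check in its docs; the point is only that these are the levers that typically tame a loss exploding this fast.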