Refactor calculators #17

gelzinyte · 2022-01-09T21:01:17Z

Rewrite the interfaces to DFT calculators (e.g. castep.evaluate_op) as classes inheriting from the ASE calculator. Workflows can then be calculator-agnostic by using the generic calculator, and we can choose different ways to parallelise the calculations, for example calling generic.run directly or calling generic.run_op within a function that is then parallelised with iterable_loop. An example is in generate_configs.vib.Vibrations.derive_normal_mode_info().

bernstei · 2022-01-10T13:15:00Z

This is written like it's already a PR ("An example is in..."). If that's intended, can you create the PR? (I think you can link it to this issue with "closes #17" in the description.)

We could also consider combining it with my new iloop wrapper, which is a bit more like a decorator than the current approach (although not using the python @ notation), if we decide it's a nicer interface.

gelzinyte · 2022-07-13T13:03:22Z

> @gelzinyte For now I'll take another look at the ASE-extended ORCA calculator, and see if I have any comments relevant to doing something similar for the other DFT calculators. [edited] I'll comment in #17. Also, can you take another look at #16, and see if it can be closed?

Great, let me know what you think. Note that we added a bunch of features to ORCA, on top of file handling, so implementing file handling for other DFT calculators would be a bit less of a hassle. All the relevant changes should only be in ORCA.__init__() and ORCA.calculate().

bernstei · 2022-07-13T14:41:01Z

One thing that might be nice (although maybe beyond the scope of this issue) is to have a way to have multiple reusable directories. If you are doing something like NEB, you might want each image to re-use its own saved charge density/wavefunctions from the previous iteration. That would require passing something to the calculator (probably to the constructor), e.g. a suffix, to define the run directory (and of course the NEB code knowing to pass it).
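A rough sketch of the suffix idea, only to make the proposal concrete. The function names, the "run_DFT" prefix and the "image_NNN" suffix format are all hypothetical and are not part of wfl or ASE; a real calculator would take the suffix in its constructor and build its run directory along these lines.

```python
from pathlib import Path


def reusable_run_dir(workdir, prefix="run_DFT", suffix=None):
    """Return (and create if needed) a run directory with a stable name.

    Hypothetical helper: because the name depends only on `prefix` and
    `suffix`, calling it again with the same arguments gives back the same
    directory, so restart files left there by an earlier step can be found.
    """
    name = prefix if suffix is None else f"{prefix}_{suffix}"
    path = Path(workdir) / name
    path.mkdir(parents=True, exist_ok=True)
    return path


def neb_image_dirs(workdir, n_images):
    """One reusable directory per NEB image, keyed by the image index."""
    return [
        reusable_run_dir(workdir, suffix=f"image_{i:03d}")
        for i in range(n_images)
    ]
```

With something like this, the NEB driver would call neb_image_dirs once per iteration and hand each image's directory (or just the suffix) to that image's calculator.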

gelzinyte · 2022-07-13T14:53:33Z

Yes, I was thinking that too. Another example: imagine only some of the autoparallelized DFT evaluations have succeeded. Then it would be nice to re-run the same script and have it read in any results that are already there, so that only the failed calculations have to be restarted.

I don't think it would be very difficult? Although probably better done as a separate issue/PR if there's need. First, we could make the temporary directories' names be Atoms-dependent hashes (for example hash of (atomic numbers, positions, calc_settings) or a pre-calculated hash in Atoms.info in NEB's case where the positions change). In addition, keep_files should be set to True (or to whatever files must be kept for a restart). Finally, you'd only need to set a flag for "reuse previous calc results".
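A minimal sketch of the three steps above, assuming results are stored as a JSON file in each run directory. Every name here (config_hash, evaluate, results.json, the run_calc callback) is made up for illustration and is not existing wfl code; for the NEB case the hash would instead be read from Atoms.info.

```python
import hashlib
import json
from pathlib import Path


def config_hash(numbers, positions, calc_settings, decimals=6):
    """Deterministic short hash of (atomic numbers, positions, settings).

    Positions are rounded so tiny floating-point noise does not change the
    hash; adding 0.0 turns -0.0 into 0.0 so both serialise identically.
    """
    payload = {
        "numbers": [int(z) for z in numbers],
        "positions": [
            [round(float(x), decimals) + 0.0 for x in pos] for pos in positions
        ],
        "settings": calc_settings,
    }
    text = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(text.encode()).hexdigest()[:16]


def evaluate(base_dir, numbers, positions, calc_settings, run_calc, reuse=True):
    """Run `run_calc(run_dir)` unless saved results can be reused.

    `run_calc` stands in for the actual DFT call and must return a
    JSON-serialisable dict. Results are written only after it returns,
    so a crashed calculation leaves no results file and is re-run.
    """
    run_dir = Path(base_dir) / f"run_{config_hash(numbers, positions, calc_settings)}"
    results_file = run_dir / "results.json"
    if reuse and results_file.exists():
        return json.loads(results_file.read_text())
    run_dir.mkdir(parents=True, exist_ok=True)
    results = run_calc(run_dir)
    results_file.write_text(json.dumps(results))
    return results
```

Including calc_settings in the hash means that changing a DFT input parameter gives a new directory rather than silently reusing old results.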

bernstei · 2022-07-13T15:01:09Z

Auto-parallelization is a whole can of worms. First, there's the question of why you think a rerun will work. Have you changed something at the expyre level, e.g. a longer runtime? Have you changed some DFT input param (max # of SCF steps)? Is it just that the node crashed and you want to try again without changing anything? All of that will affect what you think you can retain from the previous run.

I agree that it's worth thinking about, but it's not trivial. When it's just something like runtime, or number of nodes, I tend to manually log into the HPC machine, go to the relevant directory under ~/run_expyre, edit the job file by hand, qsub by hand, wait for it to finish, then rerun the wfl task so it gathers the results. Note that right now you can't rerun the wfl before the job is done, because it'll be confused about the different jobid when doing qstat. I want to fix that (maybe not even store the remote jobid, but just extract it from the output of qstat filtered by job name, which is already a unique string from the same hash as the job run directory), but haven't had time to think about the details. Maybe I should open an expyre issue. [done: https://github.com/libAtoms/ExPyRe/issues/23]

bernstei · 2022-07-13T17:35:33Z

The derived class ORCA looks mostly fine, although I don't really understand why it seems to be reimplementing stuff that I'd expect to be in the ASE original. Should we split the class into a different source file than the associated utility functions?

However, we should also think about whether we can make a more generalized structure, since I'm assuming that things like doing the mkdir will be needed for all of the DFT calculators, but there will probably always be some things that are really one-off, like the VASP pbc mangling. Should we do something similar for another calculator and see what they have in common?

gelzinyte · 2022-08-09T15:41:52Z

> Maybe I should open an expyre issue

Thanks, I think this will be very handy - re-running failed jobs in _expyre seems intuitive, if a bit hack-y.

> Should we do something similar for another calculator and see what they have in common?

I will try and implement a Castep example today/tomorrow and open a PR.

> although I don't really understand why it seems to be reimplementing stuff that I'd expect to be in the ASE original

We have added a lot of extra functionality, in some cases very specific and in some general but not contributed back to ASE. We had to over-write some of the ASE-orca functions to extend:

write_input: automatically pick default multiplicity, option for geometry optimisation in addition to single point.
read_results: raise error if wavefunction isn't converged (instead of silently skipping); read dipole, optimised geometry, trajectory, frequencies

And the rest is just not implemented in ASE-Orca.

bernstei · 2022-08-09T15:49:58Z

I'm not sure exactly when I'll have time, but I'll try to prioritize ExPyRe issue 23 when I have time to code.

I'm fine with extending ORCA relative to ASE. Do you think it's worth having two separate classes, one derived from the other? The first would have just the extensions over the ASE interface (i.e. the parts that might in principle be worth contributing back), and the second would contain just the wfl-specific implementation.

gelzinyte · 2022-08-09T16:17:15Z

> I'm not sure exactly when I'll have time, but I'll try to prioritize ExPyRe issue 23 when I have time to code.

I misread your comment and thought that issue was an already merged PR that I had missed. It would be a very convenient feature, but definitely not at all urgent or even high on the priority list.

> Do you think it's worth it to have two separate classes

I would rather not split it, because we would end up triplicating some things and I am not sure the exercise is worth it. For example, dipole calculations might be useful to contribute to ASE, but geometry optimisation isn't compatible with ASE, so both classes would need to overwrite the inherited read_results. Instead of splitting into two classes I would rather try to contribute the relevant bits to ASE, if you think that's worthwhile?

bernstei · 2022-08-09T16:46:46Z

I've had mixed success contributing to ASE. If you think it's stuff other people might use it's probably worth it, because it's not very much work to start (clone the gitlab repo, add the changes, make an MR), but we'll see how the developers respond. I've had some times when they were basically reasonable, some times when the changes they wanted were (in my opinion) excessive (e.g. modernize/fix other bugs in the same class just because I was fixing one thing), and most recently I just got no response.

bernstei · 2022-10-14T15:15:13Z

VASP, QE, and ORCA are now derived from the underlying ASE Calculator classes. Castep seems more nonstandard, so it will be dealt with in a separate issue.

bernstei mentioned this issue Jul 13, 2022

Quantum Espresso GeoOpt within ASE #124

Open

gelzinyte mentioned this issue Aug 15, 2022

wfl's QE calculator with directory handling. #136

Merged

bernstei closed this as completed Oct 14, 2022
