Test esmvalcore=2.3.0 with current batch of ESMValTool recipes #2198
Comments
The usual procedure is to update the ESMValTool repository so it uses the latest version of ESMValCore immediately after the ESMValCore release. That way there are two weeks in which diagnostic developers can check whether their recipes work with the latest version of ESMValCore before we release ESMValTool.
The bot uses the branch in the ESMValTool repository that you're requesting a test for, so if that branch uses the latest version of ESMValCore, the bot will also use it. Should we move this issue to the ESMValTool repository, as it is about testing recipes in ESMValTool? |
yeah, I was thinking of moving it, good point, man! |
For
This is the affected dataset: Here is the log file: |
Thanks for looking into this, @schlunma! For the path you posted I only get |
I can't find the data anymore on the official archive - since I got it from there it must have been retracted. Nevertheless, I found the issue (the first value of the bound of the time coordinate is smaller than |
No objection in principle. Just wasn't able to determine what exactly was going on. Do you want to make a separate issue describing the problem? |
Yes, will do. Found another issue, this time in This can be solved by adapting the recipe (with |
I got an error with the
|
I also encountered a problem with the
EDIT from @valeriupredoi (meself) - ping @Peter9192 🍺 |
autoassess stratosphere is succumbing to a plotting error related to nc-time-axis:
fairly sure this is due to the new Matplotlib 3.4.2 not liking the older nc-time-axis, but nc-time-axis=1.3.1 cannot be installed in the current configuration, see SciTools/nc-time-axis#71 |
|
@zklaus It might be worth slightly revising the recipe |
From the log, it looks like some datasets (bcc-csm1-1, GFDL-ESM2G) lose the ancillary variables after the |
Sure, the error can be circumvented by adding another dataset, but it would be good to include that in the recipe as @tomaslovato suggests. We may not need to have multi-model statistics working on a single dataset, but the recipe becomes unusable in its current form for the new release. It would also be good to work on "minimal configurations" for testing the recipes for the releases. At the moment, many of the recipes in the list can't be tested directly (at least on Mistral) because of data availability... |
@remi-kazeroni, you are completely right. The point of this issue here is to determine if a bug slipped by us in making the core release that necessitates a bugfix release for the core. It is of course always possible that a core release entails changes to the recipes; those are completely fine and should be dealt with in this period between the core release and the tool release. |
Sorry for posting these errors from failing recipes without investigating in detail... This time it is about
|
There is also a problem in recipe_deangelis15nat.yml with the derived variable lvp. Although the error message says it could be missing data, I don't think that this is the reason, but I'll try to understand what the issue is. Units shouldn't be the issue, the equation should take care of this. The error says:
recipe_li17natcc.yml ran without problems. |
Some model seems to be missing a fix that gives the latitude the same number of digits for each variable. |
recipe_martin18grl.yml runs but I found an issue (probably related to longitude) in some of the plots (e.g. SPI_mapObservations_Dur_of_Events_Mean.png). But this is not related to the new core release, I checked that it also happened with the old core last year. I'll look at it after the issues related to the release. |
For recipe_deangelis15nat.yml: should I remove MIROC5 or is it possible to do the rounding from ESMValGroup/ESMValCore#1110 also for evspsbl and hfls now, after the Core has been released? |
Without MIROC5 recipe_deangelis15nat.yml runs. |
I ran perfmetrics, smpi and gier2020bg which had no problems with ncl. Manuel also ran anav judging from the list. And Remi ran the ncl test recipe. Can you see if there's an error further up in the log for why ncdf_write is undefined for you? I did change that function a bit to allow for writing several variables into one netcdf file, but hadn't found any issues testing it without using that option. |
Thanks for that, I attach the full log, the first error is: |
You need to update your ncl version, elseif was introduced in version 6.5, you're using 6.4. Which is weird because the environment.yml file specifies ncl>=6.5 |
aha, it picks up 6.4 somehow (because that's the default on the machine maybe, or something stupid I did), thanks!!! I'll try with >=6.5 |
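For anyone hitting the same elseif error, a quick way to see which ncl executable the shell actually picks up, and whether it is at least 6.5, could look like the sketch below. It is only an illustration: the helper names are made up here, and it assumes that `ncl -V` prints the version number, which is worth double-checking on your machine.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def supports_elseif(version_text, minimum=(6, 5)):
    """Compare a version string with 6.5, the release said to add elseif."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    # Which executable wins on PATH? This may not be the one from the conda env.
    ncl_path = shutil.which("ncl")
    if ncl_path is None:
        print("ncl not found on PATH")
    else:
        # Assumption: `ncl -V` prints the version number on stdout.
        result = subprocess.run(
            [ncl_path, "-V"], capture_output=True, text=True, check=False
        )
        reported = result.stdout.strip()
        print(f"{ncl_path} reports {reported!r}")
        try:
            print("supports elseif:", supports_elseif(reported))
        except ValueError as error:
            print("could not parse version:", error)
```

If the printed path points outside the active conda environment, the machine default is shadowing the ncl>=6.5 that environment.yml asks for.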
OK don't worry about this for now, we have matplotlib pinned to |
It looks like that was just fixed with the release of iris 3.0.4 |
In principle yes, but we noticed that the update to iris 3.0.4 comes with a bunch of (very useful) implications:
so we decided to pin to <3.0.4 for the bugfix release now. Let's discuss how to proceed in a separate issue. |
Thanks a lot @bouweandela for testing all the recipes and posting the reasons for the crashes here! That is nice to have. Following up on this comment, I had a look at all recipes that crashed because of "Missing ERA5 data". Most are hydrological recipes which sometimes require up to 30 years of daily ERA5 data and up to 5 variables. It would require quite some computational effort to cmorize all that (storage would be manageable). Is it really necessary to test those recipes over the whole time period to check that they run fine from a technical point of view? The question also holds for recipes using a large number of datasets. Do we really need to run the recipes on all original datasets (including for example missing CMIP5 data) to check that they run fine? I guess most recipe failures that were spotted recently could have been caught using a subset of the data and a limited time period. It would be great if we could at some point have some kind of "test mode" for the recipes, so that we can check them faster and avoid all the missing data problems listed here. |
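To make the "test mode" idea a bit more concrete, here is a rough sketch of how a recipe could be trimmed before a technical test run. This is not an existing ESMValTool feature; `make_test_recipe` and its parameters are hypothetical, and it only assumes a recipe already loaded into a dictionary with a top-level `datasets` list whose entries carry `start_year` and `end_year`. It keeps two datasets by default, since multi-model statistics on a single dataset was one of the failures reported above.

```python
import copy


def make_test_recipe(recipe, max_datasets=2, max_years=2):
    """Return a reduced copy of a recipe dictionary for quick technical checks.

    Hypothetical helper: keeps the first max_datasets entries of the
    top-level datasets list and shortens each one to at most max_years years.
    """
    reduced = copy.deepcopy(recipe)  # never modify the caller's recipe
    kept = reduced.get("datasets", [])[:max_datasets]
    for dataset in kept:
        if "start_year" in dataset and "end_year" in dataset:
            last_year = dataset["start_year"] + max_years - 1
            dataset["end_year"] = min(dataset["end_year"], last_year)
    if "datasets" in reduced:
        reduced["datasets"] = kept
    return reduced


if __name__ == "__main__":
    full = {
        "datasets": [
            {"dataset": "A", "start_year": 1980, "end_year": 2009},
            {"dataset": "B", "start_year": 2000, "end_year": 2000},
            {"dataset": "C", "start_year": 1980, "end_year": 2009},
        ]
    }
    print(make_test_recipe(full)["datasets"])
```

A real implementation would also have to deal with datasets and time ranges defined per diagnostic or per variable, which this sketch ignores.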
Regarding failing recipes because of "missing ERA5 data":
Could the authors of these recipes (maybe @SarahAlidoost, @stefsmeets, @Peter9192?) help me get the missing aux files and dataset in order to be able to test these recipes on Mistral? I'm not sure if the auxiliary files are just stored in another folder that I'm not aware of. The other recipe failures related to "Missing ERA5 data" seem to be only due to missing ERA5 data. |
Thanks for looking into this @remi-kazeroni!
Yes. For example, if it turns out that they need 10 TB of RAM, they do not work
Yes, because the input data is so diverse that this is the only way to make sure everything works. I used 29 compute hours on Mistral for the tests I did in this issue, so the required amount of compute hours seems manageable so far.
You're probably right (cf #2240 (comment)), but if a recipe already fails on missing data, the amount of compute used is quite small because we do check that all required data is available before running any computations.
The documentation of the recipes contains links to where the shapefiles can be downloaded: |
On Mistral we have a shared directory for auxiliary data files (/mnt/lustre02/work/bd0854/DATA/ESMValTool2/AUX), similarly to OBS and RAWOBS. It could be an option to use this common directory to avoid duplications and missing data. Note that /mnt/lustre02/work/bd0854/DATA/ESMValTool2/AUX is mirrored to Jasmin.
|
Great, could you add that to the config-user.yml that is shipped with ESMValCore so people can find it? I think we may run into similar issues as with the Tier3 datasets, for this recipe: https://docs.esmvaltool.org/en/latest/recipes/recipe_carvalhais14nat.html#observations Could you please open new issues for the missing auxiliary data files? Then we can tag the authors of the respective recipes there and hopefully get it solved. |
There is an example here: https://github.com/openstreams/wflow/blob/master/examples/wflow_rhine_sbm/staticmaps/wflow_dem.map although I'm not sure if that's the same region as currently specified in the extract_region preprocessor in the recipe. Agree it would be good to document this, so as soon as there is an issue I can transfer this comment. |
I will close this issue, since we are well past the 2.3.0 release of ESMValTool. I encourage everyone who has been involved in this issue to see if there are issues that were discovered here and have neither been resolved nor are being tracked in their own ticket yet and to open separate issues for those. For the next round, i.e. 2.4.0, I suggest we try this exercise as a discussion instead of an issue to be able to keep comments together that belong together. Thanks to everyone for the effort! |
Hey good peeps @ESMValGroup/esmvaltool-developmentteam we released esmvalcore=2.3.0 yesterday but we didn't do a sanity check and run all the recipes from ESMValTool. There were some key changes pushed forward with the release and we have decided to perform this check anyway, just to be sure all's still up and running. So could we please ask you to grab one or more of your favourite recipes and run it with ESMValCore installed from either main or PyPI or conda, your choice - only be sure to have pulled the latest main and/or the latest package off PyPI/conda. A fresh installation of ESMValTool will bring forth the newly released ESMValCore if you remove the pin on 2.3 (- esmvalcore>=2.2.0,<2.3) in environment.yml and setup.py. Cheers ever so much! Oh and feel free to use the bot if @nielsdrost could install the latest esmvalcore there? There will be weissbiers as rewards for you at the next (actual, 3D, face-to-face) meeting at DLR 😁 🍺
Tier1/ISCCP/clisccp_ISCCP_L3_V1.0_* on esmeval/JASMIN is slightly dodgy according to esmvaltool's CMOR checks ESMValCore#1238 @valeriupredoi
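As a small illustration of why the pin quoted in the issue description has to go before testing: the range esmvalcore>=2.2.0,<2.3 excludes 2.3.0. The sketch below checks that with plain tuple comparison; `within_pin` is a name invented here, it only handles plain numeric release versions (no dev or rc suffixes), and it is not a full version-specifier implementation. The `__main__` part just prints whatever version is installed, assuming the distribution can be looked up under the name "esmvalcore".

```python
from importlib import metadata


def as_tuple(version):
    """Turn '2.3.0' into (2, 3, 0); only plain numeric releases are handled."""
    return tuple(int(part) for part in version.split("."))


def within_pin(version, lower="2.2.0", upper="2.3"):
    """Return True if lower <= version < upper, mirroring '>=2.2.0,<2.3'."""
    return as_tuple(lower) <= as_tuple(version) < as_tuple(upper)


if __name__ == "__main__":
    print("2.2.6 allowed by pin:", within_pin("2.2.6"))
    print("2.3.0 allowed by pin:", within_pin("2.3.0"))
    try:
        # Assumption: the package is discoverable under this distribution name.
        print("installed esmvalcore:", metadata.version("esmvalcore"))
    except metadata.PackageNotFoundError:
        print("esmvalcore is not installed in this environment")
```

With the pin in place, a fresh environment would keep resolving to a 2.2.x release, so the recipes would not actually be exercised against 2.3.0.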