Split out frequencies, models, and scenarios in regridding pipeline #40

kyleredilla · 2024-07-26T00:32:59Z

This PR adds parameters for frequency, model, and scenario to the regridding pipeline flow. This enhancement permits the processing of any single combination of model, scenario, variable, and frequency. The default option is "all" for each of these, meaning all available data.
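For reviewers who want a mental model of the "all" default, here is a minimal sketch of how such parameters could expand into individual combinations. It is not the flow's actual code: the `AVAILABLE` catalog, the `resolve` and `combinations` helpers, the placeholder model names, and the space-separated input format are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical catalog of available values. The real pipeline would
# discover these from the data on disk; these entries are placeholders.
AVAILABLE = {
    "model": ["ModelA", "ModelB"],
    "scenario": ["historical", "ssp585"],
    "variable": ["tas", "pr"],
    "frequency": ["day", "mon"],
}


def resolve(name, value):
    """Expand "all" to every available option, else split a space-separated string."""
    if value == "all":
        return list(AVAILABLE[name])
    requested = value.split()
    unknown = [v for v in requested if v not in AVAILABLE[name]]
    if unknown:
        raise ValueError(f"Unknown {name} value(s): {unknown}")
    return requested


def combinations(models="all", scenarios="all", variables="all", frequencies="all"):
    """Return every (model, scenario, variable, frequency) tuple to process."""
    return list(
        product(
            resolve("model", models),
            resolve("scenario", scenarios),
            resolve("variable", variables),
            resolve("frequency", frequencies),
        )
    )


if __name__ == "__main__":
    # A single combination, then the default "all" for everything.
    print(combinations("ModelA", "ssp585", "tas", "mon"))
    print(len(combinations()))
```

With this shape, passing one value per parameter yields exactly one combination, while leaving everything at "all" yields the full cross product.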

To test, run the regrid CMIP6 flow as you would any other Prefect flow. Use /beegfs/CMIP6/arctic-cmip6/CMIP6 as the cmip6_directory, and use fix_regrid for the branch_name.

Try testing a few different combinations, such as:

- a single frequency, model, scenario, and variable;
- then maybe two or more of each;
- then the default "all" for each, to run the whole shebang.

You can test with either a personal directory in Chinook, or the snapdata directory.

To verify the regridding, look at the QC files in <scratch_directory>/cmip6_regridding/qc/. Check the qc_error.txt file to make sure no errors were written there, and open the rendered HTML QC notebook, saved as visual_qc_out.html, to check visuals of the new data (I copy it locally via SCP, but there might be a way to view it on Chinook?).
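The qc_error.txt check can also be scripted. The sketch below assumes qc_error.txt sits directly in the QC directory named above and that an empty file (or one containing only whitespace) means no errors; the `qc_errors` helper is hypothetical and not part of the pipeline.

```python
from pathlib import Path


def qc_errors(scratch_directory):
    """Return the non-blank lines of qc_error.txt (an empty list means no errors).

    Assumes the layout <scratch_directory>/cmip6_regridding/qc/qc_error.txt.
    """
    qc_file = Path(scratch_directory) / "cmip6_regridding" / "qc" / "qc_error.txt"
    if not qc_file.exists():
        # A missing file is treated as a problem rather than as "no errors".
        raise FileNotFoundError(f"QC error file not found: {qc_file}")
    return [line for line in qc_file.read_text().splitlines() if line.strip()]
```

For example, `qc_errors("/path/to/scratch")` returns `[]` when the run wrote no errors, so a quick check is `assert not qc_errors(...)`.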

Please share any and all suggestions! This is a critical part of the CMIP6 pipeline. While I anticipate running this less and less frequently, I'm open to all ideas for polishing it and trying to improve the quality of the data product.

closes #20 #41

kyleredilla · 2024-07-30T18:00:06Z

Converting to draft, see comment on cmip6-utils PR

kyleredilla · 2024-08-08T16:49:50Z

Okay, I think this is ready to try testing again - I've made some improvements that I hope will help things be more robust and prevent the hanging I was seeing with multiprocessing in some steps (or maybe only the batch file generation).

Also, see ua-snap/snap-geo#6 - I would appreciate it if you used this PR as an opportunity to kick the tires on the new snap-geo conda environment! I have been testing with it and it seems to be working.

cstephen

This all looks good to me! As suggested, I tried running the regridding pipeline through Prefect with:

- a single var/freq/model/scenario combo
- two vars/freqs/models/scenarios in a few different combos
- "all" for each

I inspected the qc_error.txt file (always empty) and visual QC notebook after each run. The random sampling of side-by-side source & regridded images is an excellent idea and they always looked similar & sane.

I ran into the same issue described in #8 a couple times, but no biggie. That's an issue to be tackled separately.

I also tried installing the snap-geo conda environment from the new env_from_history.yml file on my Mac and it worked! I'll add more details in ua-snap/snap-geo#6

Awesome work!!

charparr · 2024-08-15T15:11:05Z

@kyleredilla great work! I've run several flows now with both the cmip6-utils and the snap-geo env and examined the QC results, and they look great. I left a few minor comments but feel free to take 'em / leave 'em prior to merging. This PR seems good to go, nice job!

regridding/luts.py

regridding/regridding_functions.py

kyleredilla added 2 commits July 25, 2024 14:58

add freqs, models, and scenarios to flows

a744f66

add missing args in script calls

3b91db5

kyleredilla mentioned this pull request Jul 26, 2024

Fix regridding pipeline and improve granularity ua-snap/cmip6-utils#59

Merged

kyleredilla requested a review from cstephen July 26, 2024 00:33

kyleredilla marked this pull request as draft July 30, 2024 18:00

add conda env name as flow parameter and wire in for tasks

0d3c97c

kyleredilla marked this pull request as ready for review August 8, 2024 16:09

cstephen approved these changes Aug 12, 2024


kyleredilla requested a review from charparr August 12, 2024 18:56

kyleredilla mentioned this pull request Aug 13, 2024

Add full variable names to regridding luts via comments #41

Open

add variable names to luts for reference

112b747

charparr approved these changes Aug 15, 2024


regridding/luts.py (resolved)

regridding/regridding_functions.py (resolved)

regridding/regridding_functions.py (resolved)

kyleredilla added 3 commits August 15, 2024 08:53

Add readme with CMIP6 variable ID reference table

2321d9e

fix typos

79ce2c6

consistent usage of underscores for unused unpacks

9005fe3

kyleredilla merged commit 7f54131 into main Aug 15, 2024

kyleredilla deleted the split_freqs branch August 15, 2024 17:04
