Split out frequencies, models, and scenarios in regridding pipeline #40
Converting to draft, see comment on cmip6-utils PR |
Okay, I think this is ready to try testing again - I've made some improvements that I hope will make things more robust and prevent the hanging I was seeing with multiprocessing in some steps (or maybe only in the batch file generation). Also, see ua-snap/snap-geo#6 - I would appreciate it if you would use this PR as an opportunity to try kicking the tires on the new |
This all looks good to me! As suggested, I tried running the regridding pipeline through Prefect with:
- a single var/freq/model/scenario combo
- two vars/freqs/models/scenarios in a few different combos
- "all" for each
I inspected the qc_error.txt file (always empty) and visual QC notebook after each run. The random sampling of side-by-side source & regridded images is an excellent idea and they always looked similar & sane.
I ran into the same issue described in #8 a couple times, but no biggie. That's an issue to be tackled separately.
I also tried installing the snap-geo conda environment from the new env_from_history.yml file on my Mac and it worked! I'll add more details in ua-snap/snap-geo#6
Awesome work!!
@kyleredilla great work! I've run several flows now with both the |
This PR adds parameters for frequency, model, and scenario to the regridding pipeline flow. This enhancement permits the processing of any single combination of model, scenario, variable, and frequency. The default option is "all" for each of these, meaning all available data.
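As a rough illustration of how "all" defaults could expand into concrete runs, here is a minimal sketch. It is not the flow's actual code: the function names, the `AVAILABLE` table, the option values in it, and the whitespace-separated input format are all assumptions made for the example.

```python
from itertools import product

# Placeholder option lists for illustration only; NOT the pipeline's real inventory.
AVAILABLE = {
    "variable": ["tas", "pr"],
    "frequency": ["day", "mon"],
    "model": ["model_a", "model_b"],
    "scenario": ["ssp245", "ssp585"],
}


def expand(name, value):
    """Turn one parameter value into a list of concrete choices."""
    if value == "all":
        return list(AVAILABLE[name])
    # Assumed input format: a whitespace-separated string such as "tas pr".
    choices = value.split()
    unknown = [c for c in choices if c not in AVAILABLE[name]]
    if unknown:
        raise ValueError(f"unknown {name} value(s): {unknown}")
    return choices


def regrid_combinations(variable="all", frequency="all", model="all", scenario="all"):
    """Return every (variable, frequency, model, scenario) tuple to process."""
    requested = {
        "variable": variable,
        "frequency": frequency,
        "model": model,
        "scenario": scenario,
    }
    per_param = [expand(name, value) for name, value in requested.items()]
    return list(product(*per_param))
```

With these placeholder lists, calling `regrid_combinations()` with no arguments yields every combination, while passing single values for all four parameters yields exactly one tuple.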
To test, execute the regrid CMIP6 flow like you would with any other Prefect flow. Use /beegfs/CMIP6/arctic-cmip6/CMIP6 as the cmip6_directory, and use fix_regrid for the branch_name.
Try testing a few different combinations of things, such as
You can test with either a personal directory in Chinook, or the snapdata directory.
To verify the regridding, you can look into the QC files in <scratch_directory>/cmip6_regridding/qc/. You can view the qc_error.txt file to make sure there are no errors written there, and check out the rendered HTML QC notebook that will be saved to visual_qc_out.html (I copy it locally via SCP but there might be a way to view on Chinook?) to check out visuals of the new data.
Please share any and all suggestions! This is a critical part of the CMIP6 pipeline. While I anticipate running this less and less frequently, I'm open to all ideas for polishing it and trying to improve the quality of the data product.
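For the qc_error.txt check described above, a small helper along these lines might save opening the file by hand. This is only a sketch: the function name is hypothetical, and it assumes the file holds plain text where any non-blank line indicates a problem.

```python
from pathlib import Path


def qc_errors(scratch_directory):
    """Return the non-blank lines of qc_error.txt in the regridding QC directory."""
    qc_file = Path(scratch_directory) / "cmip6_regridding" / "qc" / "qc_error.txt"
    if not qc_file.exists():
        # A missing file is reported separately from an empty one.
        raise FileNotFoundError(f"no QC error file at {qc_file}")
    # Assumption: any non-blank line in the file represents a reported error.
    return [line for line in qc_file.read_text().splitlines() if line.strip()]
```

An empty list from `qc_errors(...)` corresponds to the "always empty" result reported in the review above; anything else is worth reading before looking at the visual QC notebook.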
closes #20 #41