Fix multiprocessing and consolidate QC #68

Merged
merged 80 commits into from
Feb 19, 2025
Changes from 16 commits
Commits
82eaee6
move multiprocessing out of for loop
Joshdpaul Aug 30, 2024
f879224
add qc_config and job array to qc sbatch
Joshdpaul Aug 31, 2024
f92c051
add print statement to track file names/times
Joshdpaul Aug 31, 2024
0c1a6b7
use actual variable count in sbatch params
Joshdpaul Sep 3, 2024
7bf8b11
Combine qc script and notebook and simplify code
kyleredilla Sep 9, 2024
5896faa
drop refs to visual qc for runner script
kyleredilla Sep 9, 2024
67f9f9f
make qc scripts and notebook consistent
kyleredilla Sep 9, 2024
db5ce3d
small fixes for regridding qc
kyleredilla Sep 12, 2024
8a80efe
remove unused args in qc runner
kyleredilla Sep 12, 2024
bdc7ef3
pull subsampling code into qc module
kyleredilla Sep 13, 2024
60dc1b4
checkpoint for script to combine regridded data for rasdaman
kyleredilla Sep 14, 2024
1aadf51
finalize script to combine regridded files for rasdaman
kyleredilla Sep 18, 2024
f09963a
fix regridding batch files script to handle MPI-M institution ID
kyleredilla Sep 20, 2024
6001ac5
add empty variables if missing
kyleredilla Sep 20, 2024
9b90485
remove rasdaman preprocessing script for monthly common cmip6
kyleredilla Oct 1, 2024
0cce56a
remove unused dict from regridding config
kyleredilla Oct 2, 2024
aa91734
print job_id for prefect ssh to parse
kyleredilla Dec 10, 2024
7172b07
try command in place of conda_init_script
kyleredilla Dec 10, 2024
89c2de7
print job IDs for regrid runner
kyleredilla Dec 10, 2024
6bbf6a0
print list of job ids as space-separated string
kyleredilla Dec 11, 2024
369dfe6
drop crop from target dataset
kyleredilla Dec 11, 2024
7cdfe9d
ensure lon dim is 1D when sorting
kyleredilla Dec 11, 2024
e635d47
disable tryexcept for regrid call
kyleredilla Dec 11, 2024
22f4675
check for latlon dims before fixing
kyleredilla Dec 11, 2024
6725514
add interp method as top level parameter
kyleredilla Dec 12, 2024
c06a64b
fix interp_method top level parameter
kyleredilla Dec 12, 2024
81ef5d1
add missing kwarg
kyleredilla Dec 12, 2024
31593b2
fix positional arg
kyleredilla Dec 12, 2024
407ab7e
fix script arg
kyleredilla Dec 12, 2024
16a28b0
fix script arg
kyleredilla Dec 12, 2024
78adf16
print regrid qc slurm job id
kyleredilla Dec 12, 2024
3ca121f
fix regrid qc sbatch script
kyleredilla Dec 12, 2024
33b8b13
drop ref to error file
kyleredilla Dec 12, 2024
7564311
regridding qc overhaul for generic target grid
kyleredilla Dec 12, 2024
b94848b
clean up regrid qc nb
kyleredilla Dec 13, 2024
be821c2
drop bnds variables first in rasdafy
kyleredilla Dec 13, 2024
fc14494
fix longitude shift for 0-360 src files
kyleredilla Dec 15, 2024
7d30584
Add fixed frequency variables to transfers pipeline (#70)
Joshdpaul Dec 16, 2024
e87b367
add soil temp variable
kyleredilla Dec 20, 2024
89b9b0c
add check for lat var before transposing
kyleredilla Dec 21, 2024
e879b4a
add soil temperature to transfers pipeline
kyleredilla Dec 18, 2024
442a2e7
str tweak for fixed freq vars in regrid script
kyleredilla Dec 23, 2024
ecbddca
add rsus, rlus, and mlotst to transfers config
Joshdpaul Jan 6, 2025
e8eb51d
add landsea mask functionality to regrid script
kyleredilla Jan 7, 2025
0292d0e
drop unneeded regrid script arg
kyleredilla Jan 7, 2025
447a055
add sftlf arg to regrid slurm script
kyleredilla Jan 7, 2025
0ee21b5
add no files error to regrid batch gen script
kyleredilla Jan 7, 2025
dbcbe7c
add x and y as dims in regrid batch file generation
kyleredilla Jan 7, 2025
05974b3
add trycatch for sftlf lookup in regrid job generation
kyleredilla Jan 7, 2025
212c587
increase tolerance for native nanmask
kyleredilla Jan 8, 2025
4fbe4c0
tweak qc script for land/sea variables
kyleredilla Jan 8, 2025
8f82f8c
transpose plots and add error handling for file opening for regrid qc
kyleredilla Jan 8, 2025
e3bf4b5
handle use of full lat/lon names in coordinate vars by some models
kyleredilla Jan 8, 2025
899a6cc
fix duplicate nanmin in regrid batch file gen
kyleredilla Jan 8, 2025
700e073
add conversion of snw variable
kyleredilla Jan 13, 2025
0d0d8e4
expand nan fraction thresholds for using native file nanmask
kyleredilla Jan 13, 2025
ad3cab0
add ignore_degenerate arg to regridder
kyleredilla Jan 13, 2025
1f03ca0
make regridding qc subsetting more robust for nonstandard grids
kyleredilla Jan 18, 2025
048a32c
update regridding config
kyleredilla Jan 23, 2025
9383a9f
remove Omon prsn from transfers
kyleredilla Jan 23, 2025
158b2a2
add script to remove old versions of duplicate raw cmip6 files
kyleredilla Jan 24, 2025
dd2e99a
updates to improve regridding qc
kyleredilla Jan 24, 2025
3de63dd
add function to drop empty directories in cleanout script
kyleredilla Jan 24, 2025
be339d5
remove prsn as landsea variable :facepalm:
kyleredilla Jan 25, 2025
61663d8
use nan funcs for regrid qc plot comparisons
kyleredilla Jan 25, 2025
64afc34
clean up regrid qc code and fix up docs
kyleredilla Jan 25, 2025
56394bd
clean up regridding pipeline including docs
kyleredilla Jan 28, 2025
4694994
typo fix
kyleredilla Jan 29, 2025
049eac8
improve time dim encoding in regrid files
kyleredilla Jan 29, 2025
a20dc6a
fix units and dims in plotting in regrid qc
kyleredilla Jan 31, 2025
928fa29
update readme
kyleredilla Jan 31, 2025
4c64df9
bump regridding QC time limit to 2 hours
kyleredilla Jan 31, 2025
1c9e33c
tweaks to fix regridding for 360 day calendars
kyleredilla Jan 31, 2025
9ed7e84
add missing code for rasdafy arg
kyleredilla Feb 10, 2025
a2e477e
de-duplicate function name
kyleredilla Feb 10, 2025
41ae4ac
fix regrid bounding box fetch for qc and add buffer for file min max
kyleredilla Feb 11, 2025
4c3ec4b
fix regrid to source bbox conversion in qc
kyleredilla Feb 12, 2025
cc4d9d9
improve regrid qc plotting
kyleredilla Feb 13, 2025
4129940
add missing arg in regrid qc function
kyleredilla Feb 13, 2025
1e45c1c
remove holdings output text files
kyleredilla Feb 13, 2025
15 changes: 14 additions & 1 deletion regridding/generate_batch_files.py
@@ -210,6 +210,19 @@ def chunk_fp_list(df, max_size, max_count):
     return
 
 
+def get_institution_id(model, scenario):
+    """This should just be a simple lookup, however there is the oddity of MPI-ESM1-2-HR having different institution IDs for historical and SSP data"""
+    if model == "MPI-ESM1-2-HR":
+        if scenario == "historical":
+            inst = "MPI-M"
+        else:
+            inst = "DKRZ"
+    else:
+        inst = model_inst_lu[model]
+
+    return inst
+
+
 def parse_args():
     """Parse some arguments"""
     parser = argparse.ArgumentParser(description=__doc__)
@@ -282,8 +295,8 @@ def parse_args():
     for var in vars.split():
         for freq in freqs.split():
             for model in models.split():
-                inst = model_inst_lu[model]
                 for scenario in scenarios.split():
+                    inst = get_institution_id(model, scenario)
                     fps.extend(
                         list(
                             cmip6_dir.joinpath(exp_id, inst, model, scenario).glob(
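
The change to `regridding/generate_batch_files.py` moves the institution lookup inside the scenario loop because, for MPI-ESM1-2-HR, the institution ID depends on the scenario. A minimal sketch of the same behavior written as a table of per-scenario overrides is shown below. It is not the code in this PR: `MODEL_INST_LU` stands in for the repository's `model_inst_lu` (its entries here are illustrative), and `INST_OVERRIDES` and the `lookup`/`overrides` parameters are hypothetical names added for the example.

```python
# Stand-in for the repo's model_inst_lu; entries here are illustrative only.
MODEL_INST_LU = {
    "GFDL-ESM4": "NOAA-GFDL",
    "MPI-ESM1-2-HR": "DKRZ",  # default used for the SSP scenarios
}

# Hypothetical table of (model, scenario) pairs whose institution ID
# differs from the model's default.
INST_OVERRIDES = {
    ("MPI-ESM1-2-HR", "historical"): "MPI-M",
}


def get_institution_id(model, scenario, lookup=MODEL_INST_LU, overrides=INST_OVERRIDES):
    """Return the institution ID for a model/scenario pair.

    Checks the per-scenario overrides first, then falls back to the
    per-model lookup. Raises KeyError for an unknown model.
    """
    return overrides.get((model, scenario), lookup[model])


if __name__ == "__main__":
    for scenario in ("historical", "ssp585"):
        print(scenario, get_institution_id("MPI-ESM1-2-HR", scenario))
```

With this layout, another model with a scenario-dependent institution ID would need only a new entry in the overrides table rather than another `if` branch. Whether that is worth it for a single special case is a judgment call.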