Commit
Fix multiprocessing and consolidate QC (#68)
* move multiprocessing out of for loop
* add qc_config and job array to qc sbatch
* add print statement to track file names/times
* use actual variable count in sbatch params
* Combine qc script and notebook and simplify code
* drop refs to visual qc for runner script
* make qc scripts and notebook consistent
* small fixes for regridding qc
* remove unused args in qc runner
* pull subsampling code into qc module
* checkpoint for script to combine regridded data for rasdaman
* finalize script to combine regridded files for rasdaman
* fix regridding batch files script to handle MPI-M institution ID
* add empty variables if missing
* remove rasdaman preprocessing script for monthly common cmip6
* remove unused dict from regridding config
* print job_id for prefect ssh to parse
* try command in place of conda_init_script
* print job IDs for regrid runner
* print list of job ids as space-separated string
* drop crop from target dataset
* ensure lon dim is 1D when sorting
* disable tryexcept for regrid call
* check for latlon dims before fixing
* add interp method as top level parameter
* fix interp_method top level parameter
* add missing kwarg
* fix positional arg
* fix script arg
* fix script arg
* print regrid qc slurm job id
* fix regrid qc sbatch script
* drop ref to error file
* regridding qc overhaul for generic target grid
* clean up regrid qc nb
* drop bnds variables first in rasdafy
* fix longitude shift for 0-360 src files
* Add fixed frequency variables to transfers pipeline (#70)
  * stop skipping fx and Ofx frequencies when generating batch files
  * generate new batch files with fx and Ofx frequencies / new fixed frequency variables
  * use "1950" placeholder start year and end year values in grid dict if time dimension is missing from dataset
  * explicitly skip fx, Ofx, orog variables in regridding time correction functions
  * look for sftlf, sftof var names instead of freqs in filename
  * + documentation
  * transfer E3SM fixed frequency files
  * re-run e3sm holdings, fix messaging in generate_manifest.py and update the manifest; start to add specific additional files to config
  * add one-off files to config and generate new manifest; add "piControl" experiment to transfer path in batch file generation
  * generate batch files
  * import missing argparse module
  * import missing sys module
  * import missing os and upath modules
  * rm upath module actually, not available in cmip6-utils env
  * remove unused code

  ---------

  Co-authored-by: kyleredilla <[email protected]>
* add soil temp variable
* add check for lat var before transposing
* add soil temperature to transfers pipeline
* str tweak for fixed freq vars in regrid script
* add rsus, rlus, and mlotst to transfers config
* add landsea mask functionality to regrid script
* drop unneeded regrid script arg
* add sftlf arg to regrid slurm script
* add no files error to regrid batch gen script
* add x and y as dims in regrid batch file generation
* add trycatch for sftlf lookup in regrid job generation
* increase tolerance for native nanmask
* tweak qc script for land/sea variables
* transpose plots and add error handling for file opening for regrid qc
* handle use of full lat/lon names in coordinate vars by some models
* fix duplicate nanmin in regrid batch file gen
* add conversion of snw variable
* expand nan fraction thresholds for using native file nanmask
* add ignore_degenerate arg to regridder
* make regridding qc subsetting more robust for nonstandard grids
* update regridding config
* remove Omon prsn from transfers
* add script to remove old versions of duplicate raw cmip6 files
* updates to improve regridding qc
* add function to drop empty directories in cleanout script
* remove prsn as landsea variable 🤦
* use nan funcs for regrid qc plot comparisons
* clean up regrid qc code and fix up docs
* clean up regridding pipeline including docs
* typo fix
* improve time dim encoding in regrid files
* fix units and dims in plotting in regrid qc
* update readme
* bump regridding QC time limit to 2 hours
* tweaks to fix regridding for 360 day calendars
* add missing code for rasdafy arg
* de-duplicate function name
* fix regrid bounding box fetch for qc and add buffer for file min max
* fix regrid to source bbox conversion in qc
* improve regrid qc plotting
* add missing arg in regrid qc function
* remove holdings output text files

---------

Co-authored-by: Joshdpaul <[email protected]>
Co-authored-by: Josh Paul <[email protected]>
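
The entries "fix longitude shift for 0-360 src files" and "ensure lon dim is 1D when sorting" name the fix without showing it. As a rough illustration only, the sketch below shows one common way to map a 1D axis of 0-360 longitudes onto -180..180 and reorder the data to match. The function name `shift_longitudes` and the plain-list inputs are assumptions made for this example; the repository's actual code is not shown in this commit message and may differ.

```python
def shift_longitudes(lons, values):
    """Map 0-360 longitudes onto [-180, 180) and reorder values to match.

    Hypothetical helper for illustration; expects 1D sequences of equal length.
    """
    if len(lons) != len(values):
        raise ValueError("lons and values must have the same length")

    # Wrap each longitude: 0..180 stays put, 180..360 becomes -180..0.
    shifted = [((lon + 180.0) % 360.0) - 180.0 for lon in lons]

    # Sort by the new longitudes so the axis is monotonically increasing,
    # and apply the same ordering to the data values.
    order = sorted(range(len(shifted)), key=lambda i: shifted[i])
    return [shifted[i] for i in order], [values[i] for i in order]


lons = [0.0, 90.0, 180.0, 270.0]
values = ["a", "b", "c", "d"]
new_lons, new_values = shift_longitudes(lons, values)
print(new_lons)    # [-180.0, -90.0, 0.0, 90.0]
print(new_values)  # ['c', 'd', 'a', 'b']
```

Sorting after the wrap matters: without it the longitude axis would jump from 90 to -180 partway through, which a regridder expecting a monotonic axis may not accept.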