Missing Data in Downscaled CMIP6 Datasets for China Region #323

Open
zhouyunzhouyun opened this issue Jan 31, 2024 · 2 comments

Comments

@zhouyunzhouyun

I am writing to seek your assistance regarding an issue I encountered with the CMIP6 downscaled datasets. I have been following the instructions provided in your GitHub repository, specifically the notebook at https://github.com/carbonplan/cmip6-downscaling/blob/main/notebooks/accessing_data_example.ipynb, to download the downscaled CMIP data.

While the process was straightforward, I observed a significant number of missing values in the datasets corresponding to the China region. This issue is critical for my research as it pertains to climate impact studies in this area.

Attached to this question, you will find details of the datasets where these missing values are prevalent. I would greatly appreciate any insights or suggestions you might have on how to address this problem. Are there any alternative sources or methods I can use to obtain a more complete dataset for my research?


[Attached images: Q3, Q2, Q1]

@maxrjones
Contributor

maxrjones commented Feb 1, 2024

Thanks for raising this issue!

I suspect the missing chunks are caused by network (I/O) errors rather than by data missing from storage, because we do not see the same gaps when using Planetary Computer, which runs in the West Europe region of Microsoft Azure (the same location as the data). We hope to look into ways to keep I/O errors from silently affecting workflows within the next couple of months.
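If the gaps really are transient read failures that surface as NaN fill values rather than exceptions, one possible workaround is to read the same slice more than once and fill cells that are NaN in one read from another. The following is only a minimal sketch under that assumption, not part of this repository: `read_with_retries` and `make_flaky_loader` are hypothetical names, and `load` stands for whatever call materializes your array (for example `lambda: ds["tasmax"].isel(time=0).values`, where `tasmax` is just an illustrative variable name).

```python
import numpy as np


def read_with_retries(load, max_attempts=3):
    """Call `load()` up to `max_attempts` times and merge the results.

    `load` is a hypothetical zero-argument callable returning a float array.
    Cells that are NaN in an earlier read are filled from later reads, so
    gaps that appear in only some reads are repaired, while cells that are
    NaN in every read (e.g. masked areas) stay NaN.
    """
    merged = np.array(load(), dtype=float)
    for _ in range(max_attempts - 1):
        if not np.isnan(merged).any():
            break  # nothing left to fill
        retry = np.asarray(load(), dtype=float)
        # Keep values already read; fill only cells that are still NaN.
        merged = np.where(np.isnan(merged), retry, merged)
    return merged


def make_flaky_loader():
    """Build a fake loader whose first read loses one block (for demonstration)."""
    truth = np.arange(16, dtype=float).reshape(4, 4)
    truth[0, 0] = np.nan  # a genuinely missing cell
    calls = {"n": 0}

    def load():
        calls["n"] += 1
        out = truth.copy()
        if calls["n"] == 1:
            out[2:, 2:] = np.nan  # simulate a chunk lost to a failed read
        return out

    return load, truth


load, truth = make_flaky_loader()
result = read_with_retries(load)
# Only the genuinely missing cell should remain NaN after merging.
print(int(np.isnan(result).sum()))
```

Because genuinely missing cells never get filled, this always performs all `max_attempts` reads on data with a permanent mask, so it multiplies transfer cost. It is best treated as a diagnostic: if the NaN pattern changes between reads, the problem is in the transfer rather than in the stored data.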

As for alternative sources, our research explainer includes references to the other commonly used global downscaled CMIP6 datasets (NASA NEX BCSD and Climate Impact Lab QDM + QPLAD). Running computational workflows close to the data (e.g., on Planetary Computer) can also reduce the likelihood of network errors.

@maxrjones
Contributor

Just noting for future work that I think zarr-developers/zarr-python#489, or something similar, will be necessary for fixing these issues.
