Error when loading Zarr chunk missing. #9701

Closed
alvarosg opened this issue Oct 31, 2024 · 1 comment

@alvarosg

Is your feature request related to a problem?

When opening a Zarr dataset with xarray.open_zarr and then calling compute on a slice, a missing Zarr chunk file causes the data to come back filled with NaNs. This is problematic because it becomes impossible to tell whether the NaNs are legitimate values in the data or the result of a missing chunk file. Checking for NaNs on large arrays is also expensive.

Describe the solution you'd like

Ideally, when calling compute on a slice of a Zarr dataset for which a chunk file is missing, there should be an option, enabled by default, that raises an error.

For example:

import xarray

# error_on_missing_chunks is the option proposed here, not an existing xarray parameter.
dataset = xarray.open_dataset("path_to_zarr_with_missing_chunk_for_2021-01-02.zarr", error_on_missing_chunks=True)

data_slice = dataset.sel(time="2021-01-01")
data_slice.compute()  # All chunks present, so this returns the data

data_slice = dataset.sel(time="2021-01-02")
data_slice.compute()  # Raises MissingChunkError("Could not retrieve data. At least one chunk for the selected slice is missing")

Describe alternatives you've considered

No response

Additional context

No response

@dcherian
Contributor

This is an upstream issue: when a chunk is missing, Zarr returns a chunk with every value set to fill_value.
zarr-developers/zarr-python#486
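
Until something like this exists upstream, one possible workaround is to check the store for absent chunk files before calling compute, rather than scanning the loaded data for NaNs. The following is only a minimal sketch under several assumptions: the array lives in a Zarr v2 directory store on a local filesystem, it has at least one dimension, and its `.zarray` metadata is intact. The names `missing_chunk_keys`, `check_no_missing_chunks` and `MissingChunkError` are hypothetical helpers written for this illustration, not part of xarray or zarr.

```python
import itertools
import json
import math
from pathlib import Path


class MissingChunkError(Exception):
    """Hypothetical error type, named after the one proposed in this issue."""


def missing_chunk_keys(array_dir):
    """List chunk keys a Zarr v2 array expects but that are absent on disk.

    Assumes a local directory store and an array with at least one dimension.
    """
    array_dir = Path(array_dir)
    meta = json.loads((array_dir / ".zarray").read_text())
    shape, chunks = meta["shape"], meta["chunks"]
    # Zarr v2 joins chunk indices with "." unless the metadata says otherwise.
    sep = meta.get("dimension_separator", ".")
    # Number of chunks along each dimension (the last one may be partial).
    grid = [math.ceil(s / c) for s, c in zip(shape, chunks)]
    missing = []
    for index in itertools.product(*(range(n) for n in grid)):
        key = sep.join(str(i) for i in index)
        if not (array_dir / key).is_file():
            missing.append(key)
    return missing


def check_no_missing_chunks(array_dir):
    """Raise MissingChunkError if any expected chunk file is absent."""
    missing = missing_chunk_keys(array_dir)
    if missing:
        raise MissingChunkError(
            f"{len(missing)} chunk(s) missing in {array_dir}, e.g. {missing[:5]}"
        )
```

In a dataset written by xarray, each variable is a subdirectory of the `.zarr` directory, so the check would be run once per variable, for example `check_no_missing_chunks("path_to.zarr/some_variable")` (the path is a placeholder). One caveat: depending on how the store was written, Zarr may legitimately skip chunks that contain only the fill value, so an absent key is not always a sign of corruption.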
