Document Suggested Workflow #40

Open
ChrisBarker-NOAA opened this issue Jun 26, 2024 · 0 comments

This code is focused on a specific part of the workflow folks may need to do -- but we also provide tools and utilities for other bits. So I think it's helpful to document the suggested workflow, and that will also help us determine where to put code.

My first draft:

Goal:

Starting Point:

User has a set of data that can be loaded into xarray: could be files on disk, or files on AMS, or Kerchunked zarr dataset, or ....

User needs a subset of that data:

  • Restricted to:
    • a polygon in space
    • particular time frame
    • either a single vertical layer or all vertical layers (proper vertical subsetting can wait ...)
    • only the variables they need.

Outcome:

An xarray Dataset ready to save to netCDF, or .....

That Dataset contains only what the user wants -- and is as similar to the original as possible: e.g. the same names for all variables, maybe some additional metadata.

Workflow:

Step 1:

User does any pre-processing required to get their data into a single, conforming dataset.

In many cases, there's nothing to be done, but in some cases there may be work required:

  1. The grid and data variables are in multiple files; they need to be combined into one dataset.
  2. There are "troublesome" variables -- e.g. time coordinates that aren't correct, etc.

As a rule, this will be model specific, maybe even implementation-of-model specific.

This package can't provide all of that, but it can (and should) provide a few examples for common cases.

e.g. SCHISM (STOFS), maybe FVCOM fixing the time variable (some use single-precision float days :-()
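A minimal sketch of this pre-processing step, using toy stand-ins for the model files (real code would open the grid and output files with `xr.open_dataset`; the epoch and variable names here are illustrative):

```python
import numpy as np
import xarray as xr

# Toy stand-ins for a separate grid file and data file, so the
# sketch is self-contained. In practice:
#   grid = xr.open_dataset("model_grid.nc")
#   data = xr.open_dataset("model_output.nc")
grid = xr.Dataset(
    {"lon": ("node", np.linspace(-125.0, -124.0, 4)),
     "lat": ("node", np.linspace(45.0, 46.0, 4))}
)
data = xr.Dataset(
    {"temp": (("time", "node"), np.ones((3, 4)))},
    coords={"time": ("time", np.array([0.0, 0.25, 0.5]))},  # float days
)

# 1. Combine grid and data variables into a single Dataset.
ds = xr.merge([grid, data])

# 2. Fix a "troublesome" time coordinate: float days since an
#    (assumed) epoch -> proper datetime64 values.
epoch = np.datetime64("2024-06-01", "ns")
ds = ds.assign_coords(
    time=epoch + (ds["time"].values * 86400 * 1e9).astype("timedelta64[ns]")
)
```

The result is one conforming Dataset with both the grid and the data variables, and a usable time coordinate.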

Step 2:

The user processes the Dataset to make it CF compliant (or compliant enough that the subsetting code can work).

This package will contain utilities to do that, e.g.

ugrid.assign_ugrid_topology()
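For illustration, here is a hand-rolled sketch of the kind of thing such a helper does -- adding the UGRID "mesh_topology" dummy variable that CF/UGRID-aware tools look for. The exact signature of `assign_ugrid_topology()` is the package's; this just shows the convention it targets:

```python
import numpy as np
import xarray as xr

# Toy unstructured-grid Dataset: node coordinates plus a
# face-node connectivity array (two triangles).
ds = xr.Dataset(
    {
        "temp": ("node", np.zeros(4)),
        "face_node_connectivity": (
            ("face", "vertex"), np.array([[0, 1, 2], [1, 2, 3]])
        ),
    },
    coords={
        "lon": ("node", np.array([-125.0, -124.8, -124.6, -124.4])),
        "lat": ("node", np.array([45.0, 45.2, 45.4, 45.6])),
    },
)

# The UGRID conventions use a dummy scalar variable whose
# attributes describe the mesh topology.
ds["mesh"] = xr.DataArray(
    0,
    attrs={
        "cf_role": "mesh_topology",
        "topology_dimension": 2,
        "node_coordinates": "lon lat",
        "face_node_connectivity": "face_node_connectivity",
    },
)
ds["temp"].attrs["mesh"] = "mesh"  # point the data variable at the mesh
```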

Step 3:

The Dataset can be queried by the user to find out what they need to know in order to specify a subset:

  • what variables are in the dataset
  • what timespan is covered
  • what region is covered (maybe?)
  • whether it's 2D or 3D?
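Each of these queries can be answered with plain xarray; a sketch against a toy Dataset (the coordinate and dimension names, e.g. "depth" for the vertical, are assumptions):

```python
import numpy as np
import xarray as xr

# Toy stand-in for the user's conforming Dataset.
ds = xr.Dataset(
    {"temp": (("time", "node"), np.zeros((3, 4))),
     "salt": (("time", "node"), np.zeros((3, 4)))},
    coords={
        "time": np.array(
            ["2024-06-01", "2024-06-02", "2024-06-03"], dtype="datetime64[ns]"
        ),
        "lon": ("node", np.array([-125.0, -124.8, -124.6, -124.4])),
        "lat": ("node", np.array([45.0, 45.2, 45.4, 45.6])),
    },
)

# What variables are in the dataset?
variables = list(ds.data_vars)

# What timespan is covered?
tmin, tmax = ds["time"].values.min(), ds["time"].values.max()

# What region is covered? (bounding box of the node coordinates)
bbox = (float(ds["lon"].min()), float(ds["lat"].min()),
        float(ds["lon"].max()), float(ds["lat"].max()))

# 2D or 3D? (assumption: a vertical dimension would be named "depth")
is_3d = "depth" in ds.dims
```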

Step 4:

The user makes a request for a subset.

Result -- a subset Dataset.
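The subsetting machinery itself is what this package provides; purely as a sketch of the shape of the request and result, here is a pure-xarray version using a bounding box as a stand-in for real polygon-in-space subsetting (all names are illustrative, not the package's API):

```python
import numpy as np
import xarray as xr

# Toy stand-in for a conforming Dataset (see Step 1).
ds = xr.Dataset(
    {"temp": (("time", "node"), np.arange(12.0).reshape(3, 4)),
     "salt": (("time", "node"), np.zeros((3, 4)))},
    coords={
        "time": np.array(
            ["2024-06-01T00", "2024-06-01T06", "2024-06-01T12"],
            dtype="datetime64[ns]",
        ),
        "lon": ("node", np.array([-125.0, -124.8, -124.6, -124.4])),
        "lat": ("node", np.array([45.0, 45.2, 45.4, 45.6])),
    },
)

# The request: only "temp", a particular time frame, and only the
# nodes inside a region (bounding box here; a real implementation
# would test nodes against a polygon via the grid topology).
subset = ds[["temp"]].sel(time=slice("2024-06-01T00", "2024-06-01T06"))
in_box = ((ds.lon >= -125.0) & (ds.lon <= -124.7)
          & (ds.lat >= 45.0) & (ds.lat <= 45.3))
subset = subset.isel(node=np.nonzero(in_box.values)[0])
```

The result is a subset Dataset with the same variable names as the original, ready to save or keep working with.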
