Add Codec unit tests #2035

rabernat · 2024-07-13T21:05:08Z

While working on #2031 I became familiar with the new V3 Codec API and its peculiarities. And I saw that we don't yet have actual unit tests for the codecs. We have some tests in tests/v3/test_codecs/, but I'd call these more end-to-end tests, since they are creating Arrays.

I think it's important for us to unit-test all of the important internal interfaces separately from end-to-end tests. This is particularly important for codecs, so we can guard against data corruption issues.

This PR is a step in that direction.

TODO:

Add decode_partial and encode_partial tests.
Parametrize over more variations of input data.
Look for opportunities to make these tests simpler / faster (right now there is a combinatorial explosion of possibilities)
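To illustrate the kind of codec-level test meant here, the following is a minimal sketch of a round-trip check that never creates an Array. `ToyZlibCodec`, `check_round_trip`, and `run_round_trip_grid` are hypothetical names, and the codec is a stand-in built on the standard library's `zlib`, not the actual V3 Codec API. The grid of levels and payloads also shows how quickly the number of cases multiplies.

```python
import itertools
import zlib


class ToyZlibCodec:
    """Hypothetical stand-in for a bytes-to-bytes codec; not the zarr API."""

    def __init__(self, level: int = 1) -> None:
        self.level = level

    def encode(self, data: bytes) -> bytes:
        return zlib.compress(data, self.level)

    def decode(self, data: bytes) -> bytes:
        return zlib.decompress(data)


def check_round_trip(codec: ToyZlibCodec, payload: bytes) -> None:
    # decode(encode(x)) must return x exactly; anything else is corruption.
    assert codec.decode(codec.encode(payload)) == payload


# Assumed example inputs: empty, highly compressible, and mixed bytes.
LEVELS = [1, 6, 9]
PAYLOADS = [b"", b"\x00" * 1024, bytes(range(256)) * 4]


def run_round_trip_grid() -> int:
    """Run every (level, payload) combination and return how many ran."""
    count = 0
    for level, payload in itertools.product(LEVELS, PAYLOADS):
        check_round_trip(ToyZlibCodec(level), payload)
        count += 1
    return count
```

In a pytest suite the same grid could be expressed with stacked `@pytest.mark.parametrize` decorators, which is where the combinatorial growth mentioned above comes from.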

rabernat · 2024-07-14T17:04:15Z

One area of feedback on the Codec API: it makes very little sense to me that the Codec API is async. Almost by definition Codecs are blocking, CPU-intensive code. They are not doing I/O. Why should their core methods be async?

It should be the Pipeline's job to dispatch blocking Codecs calls to threads. Not the Codec itself.
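A minimal sketch of that division of labor, assuming a synchronous codec and a pipeline that owns the async boundary. `SyncZlibCodec` and `ThreadedPipeline` are hypothetical names used only for illustration and are not part of the existing Codec API; `asyncio.to_thread` is the standard-library call that runs a blocking function in a worker thread.

```python
import asyncio
import zlib


class SyncZlibCodec:
    """Hypothetical synchronous codec: blocking, CPU-bound, no I/O."""

    def encode(self, chunk: bytes) -> bytes:
        return zlib.compress(chunk)

    def decode(self, chunk: bytes) -> bytes:
        return zlib.decompress(chunk)


class ThreadedPipeline:
    """Hypothetical pipeline that dispatches blocking codec calls to threads."""

    def __init__(self, codec: SyncZlibCodec) -> None:
        self.codec = codec

    async def encode_batch(self, chunks: list[bytes]) -> list[bytes]:
        # Each blocking encode runs in a worker thread, so the event loop
        # stays free. gather preserves the input order of the chunks.
        return await asyncio.gather(
            *(asyncio.to_thread(self.codec.encode, c) for c in chunks)
        )

    async def decode_batch(self, chunks: list[bytes]) -> list[bytes]:
        return await asyncio.gather(
            *(asyncio.to_thread(self.codec.decode, c) for c in chunks)
        )
```

With this split the codec can be unit-tested with plain function calls, and only the pipeline needs an event loop.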

d-v-b · 2024-07-20T20:07:47Z

> One area of feedback on the Codec API: it makes very little sense to me that the Codec API is async. Almost by definition Codecs are blocking, CPU-intensive code. They are not doing I/O. Why should their core methods be async?
>
> It should be the Pipeline's job to dispatch blocking Codecs calls to threads. Not the Codec itself.

The reason why all codecs need to be async is because sharding is a codec, and the encode / decode operation of the sharding codec requires doing IO.

I would love to see some formal separation between "codecs that read from storage" (i.e., just sharding) and "codecs that transform bytes in memory" (all the other codecs), but I'm not sure what this would look like.
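One possible shape for such a separation, offered purely as an illustration and not as a proposal that reflects the current code: two base classes, one synchronous and one asynchronous, with the pipeline awaiting only the storage-facing one. Every name here (`InMemoryCodec`, `StorageCodec`, `ToyShardCodec`, `decode_chunk`) is hypothetical, the store is a plain `dict`, and the "shard" is just a byte range, which is far simpler than real sharding.

```python
import asyncio
import zlib
from abc import ABC, abstractmethod


class InMemoryCodec(ABC):
    """Hypothetical base: transforms bytes in memory, synchronously."""

    @abstractmethod
    def decode(self, data: bytes) -> bytes: ...


class StorageCodec(ABC):
    """Hypothetical base: must read from storage, so it is async."""

    @abstractmethod
    async def decode(self, store: dict[str, bytes], key: str) -> bytes: ...


class ZlibCodec(InMemoryCodec):
    def decode(self, data: bytes) -> bytes:
        return zlib.decompress(data)


class ToyShardCodec(StorageCodec):
    """Toy stand-in for sharding: slices one inner chunk out of a shard."""

    def __init__(self, offset: int, length: int) -> None:
        self.offset = offset
        self.length = length

    async def decode(self, store: dict[str, bytes], key: str) -> bytes:
        await asyncio.sleep(0)  # placeholder for a real asynchronous read
        shard = store[key]
        return shard[self.offset : self.offset + self.length]


async def decode_chunk(
    store: dict[str, bytes],
    key: str,
    storage_codec: StorageCodec,
    inner: list[InMemoryCodec],
) -> bytes:
    # Only the storage-facing codec is awaited directly.
    data = await storage_codec.decode(store, key)
    # In-memory codecs stay synchronous; the caller moves them to threads.
    for codec in inner:
        data = await asyncio.to_thread(codec.decode, data)
    return data
```

Under this sketch only one class of codec ever sees the store, and everything else can be tested without an event loop.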

d-v-b

I love tests! thanks for this @rabernat

d-v-b · 2024-10-10T19:53:47Z

oops, I approved without noting that this is a draft. sorry for the noise. consider it a draft approval.

rabernat added 2 commits July 13, 2024 16:59

add codec unit tests

767eb5f

remove failing blosc tests

8610ac0

jhamman added the V3 (Affects the v3 branch) label Aug 9, 2024

jhamman added this to the After 3.0.0 milestone Oct 1, 2024

d-v-b approved these changes Oct 10, 2024

jhamman changed the base branch from v3 to main October 14, 2024 20:57
