feat(zarr-metadata): partial metadata types#3982
Conversation
Plan for adding ArrayMetadataV3Partial, GroupMetadataV3Partial, ArrayMetadataV2Partial, and GroupMetadataV2Partial to zarr-metadata as siblings of the existing full TypedDicts, plus an equivalence test that prevents drift.
Sibling TypedDict to ArrayMetadataV3 with total=False, intended for typing dicts that intentionally hold a subset of a complete v3 array metadata document (test fixtures, fragment templates). Drift between the two is prevented by a new equivalence test.
Sibling TypedDict to GroupMetadataV3 with total=False.
Sibling TypedDict to ArrayMetadataV2 with total=False.
Sibling TypedDict to GroupMetadataV2 with total=False. Added for symmetry with the other v2/v3 *Partial types; the only practical difference from GroupMetadataV2 is that zarr_format becomes optional.
…types After landing all four *Partial classes, the docstrings had drifted into two shapes: ArrayMetadataV2Partial / GroupMetadataV2Partial used the self-contained form (summary → use-case → NotRequired rationale → drift), while GroupMetadataV3Partial deferred its use-case paragraph to ArrayMetadataV3Partial, and ArrayMetadataV3Partial put its drift sentence before the NotRequired rationale. Standardize on the self-contained form with consistent paragraph order so each Partial reads on its own and any future prose edit only needs to touch the one class it concerns.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3982 +/- ##
=======================================
Coverage 93.34% 93.34%
=======================================
Files 88 88
Lines 11819 11819
=======================================
Hits 11033 11033
Misses 786 786 🚀 New features to boost your workflow:
|
|
thank you max! |
chuckwondo
left a comment
There was a problem hiding this comment.
@d-v-b, I'm still not convinced these are necessary. Using zarr-rs as a reference point, offhand I don't see any such equivalents. Am I missing something?
I agree that these are not strictly necessary! they are just a convenience that zarr-python will only use in tests, to ensure that we can properly annotate fragments of metadata documents. but I anticipate that other libraries might find these partial types useful in other applications. |
That's part of my point though. I don't think these are necessary for tests either. Rather, I'm suggesting that the tests should be written differently. I'm failing to see what you feel cannot be properly tested without these types? Why do you think they're needed here, when they are not needed in zarr-rs (as far as I can tell, so please correct me if I missed something). In my view, tests should generally be reflective of real usage, and these types should not be available to users, so I'm still not seeing their necessity. Again, perhaps I'm missing something. I feel that these additional types may simply add undue clutter/complexity/maintenance overhead and potential for confusion/cognitive overload should users discover them. |
There's nothing that can't be properly tested without these types. These types are helpful for any test that consumes a fragment of a metadata document, like this one. But these types are required for making this API type safe: @dataclass
class ArrayMetadata:
shape: tuple[int, ...]
# remaining array metadata keys as data class fields
...
def update(self, **changes: Unpack[ArrayMetadataV3Partial]) -> Self: ...The above API is a very sane API we should support. A single |
|
I'm still not convinced, but we can conclude the conversation for now, as I suspect I still have significant gaps in my understanding that I need to fill before I can see things more clearly. My lack of being convinced at this point is based on my general development experience, not my specific (lack of) understanding of the zarr ecosystem, so gaining more knowledge of it may very well change my perspective here. Thanks for your patience with my push backs 😅 |
|
no worries! I could also be wrong, and maybe these types are useless (or python gets a |
|
fwiw I held some of the same skepticism as @chuckwondo, but think we should value experimentation in the metadata library to try different approaches and iterate faster towards a stable v1 API. IMO we do not need to have the same standard as in the core library that everything should be undeniably useful. That tolerance for experimentation is part value of using this monorepo approach. |
Adds partial forms of ArrayMetadataV2, ArrayMetadataV3, GroupMetadataV2, GroupMetadataV3 typeddicts, i.e. typeddicts where all keys are optional. Today, these types can be consumed by zarr-python test code (in a follow-up PR). In the future, if we every want to expose type-safe APIs for updating array / group metadata, then these types would be useful there.