Dataclass as_dict has no way of excluding fields. #120504

Closed
pranaliyawalkar opened this issue Jun 14, 2024 · 2 comments
Labels: stdlib (Python modules in the Lib dir), type-feature (a feature request or enhancement)

Comments

@pranaliyawalkar

pranaliyawalkar commented Jun 14, 2024

Bug report

Bug description:

Passing a dict_factory is one way to drop fields from the top-level dict that asdict() returns, but the dropped fields still go through expensive copy.deepcopy calls before they are discarded.
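A minimal sketch of the behaviour described above, assuming a hypothetical `Payload` class that counts how often it is deep-copied. `Payload`, `Record`, and `drop_payload` are illustration names only and are not part of the issue or of the standard library.

```python
from dataclasses import asdict, dataclass, field


class Payload:
    """Hypothetical stand-in for a field that is expensive to copy."""

    copies = 0  # class-level counter of deepcopy calls

    def __deepcopy__(self, memo):
        Payload.copies += 1
        return Payload()


@dataclass
class Record:
    name: str
    payload: Payload = field(default_factory=Payload)


def drop_payload(items):
    # dict_factory receives (name, value) pairs whose values were already copied.
    return {k: v for k, v in items if k != "payload"}


result = asdict(Record("r1"), dict_factory=drop_payload)
print(result)          # {'name': 'r1'}
print(Payload.copies)  # 1: the dropped field was still deep-copied
```

The filter only sees the pairs after recursion has finished, which is why it cannot avoid the copy.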

def _asdict_inner(obj, dict_factory, exclude_list=[]):
    if _is_dataclass_instance(obj):
        result = []
        for f in fields(obj):
            # Skip excluded fields before recursing, so they are never copied.
            if f.name not in exclude_list:
                value = _asdict_inner(getattr(obj, f.name), dict_factory)
                result.append((f.name, value))
        return dict_factory(result)
    elif isinstance(obj, tuple) and hasattr(obj, '_fields'):
        # obj is a namedtuple.  Recurse into it, but the returned
        # object is another namedtuple of the same type.  This is
        # similar to how other list- or tuple-derived classes are
        # treated (see below), but we just need to create them
        # differently because a namedtuple's __init__ needs to be
        # called differently (see bpo-34363).

        # I'm not using namedtuple's _asdict()
        # method, because:
        # - it does not recurse in to the namedtuple fields and
        #   convert them to dicts (using dict_factory).
        # - I don't actually want to return a dict here.  The main
        #   use case here is json.dumps, and it handles converting
        #   namedtuples to lists.  Admittedly we're losing some
        #   information here when we produce a json list instead of a
        #   dict.  Note that if we returned dicts here instead of
        #   namedtuples, we could no longer call asdict() on a data
        #   structure where a namedtuple was used as a dict key.

        return type(obj)(*[_asdict_inner(v, dict_factory) for v in obj])
    elif isinstance(obj, (list, tuple)):
        # Assume we can create an object of this type by passing in a
        # generator (which is not true for namedtuples, handled
        # above).
        return type(obj)(_asdict_inner(v, dict_factory) for v in obj)
    elif isinstance(obj, dict):
        return type(obj)((_asdict_inner(k, dict_factory),
                          _asdict_inner(v, dict_factory))
                         for k, v in obj.items())
    else:
        return copy.deepcopy(obj)

The above fix, which passes exclude_list on the first call to _asdict_inner, would work. The recursive calls do not forward exclude_list, so only top-level fields are excluded.

CPython versions tested on:

3.11

Operating systems tested on:

Linux

@pranaliyawalkar added the type-bug label Jun 14, 2024
@sobolevn added the type-feature and stdlib labels and removed the type-bug label Jun 14, 2024
@sobolevn
Member

  1. What if you have nested classes:
from dataclasses import asdict, dataclass

@dataclass
class A:
    foo: str
    bar: str

@dataclass
class B:
    a: A

data = asdict(B(A(foo="x", bar="y")))

And you want to drop a.foo here? I think this would be a natural extension to the proposed feature.
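One way the nested case could be handled is with dotted paths such as `"a.foo"`. The sketch below is a standalone helper, not a proposed stdlib API: the name `asdict_excluding`, the `exclude` parameter, and the dotted-path convention are all assumptions made for illustration. It recurses into dataclasses, lists, and dicts only; tuples and namedtuples are simply deep-copied, unlike in `asdict()`.

```python
import copy
from dataclasses import dataclass, fields, is_dataclass


def asdict_excluding(obj, exclude=None, _prefix=""):
    """Hypothetical helper: like asdict(), but skips fields named by dotted path."""
    exclude = frozenset(exclude or ())
    if is_dataclass(obj) and not isinstance(obj, type):
        result = {}
        for f in fields(obj):
            path = _prefix + f.name
            if path in exclude:
                continue  # skipped before any copying happens
            value = getattr(obj, f.name)
            result[f.name] = asdict_excluding(value, exclude, path + ".")
        return result
    if isinstance(obj, list):
        # List elements share the path of the field that holds them.
        return [asdict_excluding(v, exclude, _prefix) for v in obj]
    if isinstance(obj, dict):
        return {copy.deepcopy(k): asdict_excluding(v, exclude, _prefix)
                for k, v in obj.items()}
    return copy.deepcopy(obj)


@dataclass
class A:
    foo: str
    bar: str


@dataclass
class B:
    a: A


data = asdict_excluding(B(A(foo="x", bar="y")), exclude={"a.foo"})
print(data)  # {'a': {'bar': 'y'}}
```

Whether paths, nested dicts, or callables are the right way to name nested fields is exactly the kind of design question the thread leaves open.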

  2. How bad is it actually, performance-wise, on real data? How much faster would it be with exclude_fields?
  3. Why is exclude_fields [] by default? It should probably be None by default and documented as a set.
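The performance question could be explored with a rough timing harness like the one below. It is only a sketch: `Row`, `via_dict_factory`, and `via_prefilter` are hypothetical names, the data is synthetic, and the pre-filter variant handles flat dataclasses only. No result is claimed here, since the numbers depend on the data and the Python version.

```python
import copy
import timeit
from dataclasses import asdict, dataclass, field, fields


@dataclass
class Row:
    key: str
    # Synthetic "large" field: 200 small lists of ints.
    blob: list = field(default_factory=lambda: [[i] * 10 for i in range(200)])


def via_dict_factory(row):
    # Every field is recursively copied first; the filter runs afterwards.
    return asdict(row, dict_factory=lambda items: {k: v for k, v in items if k != "blob"})


def via_prefilter(row, exclude=frozenset({"blob"})):
    # Flat dataclasses only: skip excluded fields before copying anything.
    return {f.name: copy.deepcopy(getattr(row, f.name))
            for f in fields(row) if f.name not in exclude}


row = Row("k")
assert via_dict_factory(row) == via_prefilter(row) == {"key": "k"}

t_factory = timeit.timeit(lambda: via_dict_factory(row), number=20)
t_prefilter = timeit.timeit(lambda: via_prefilter(row), number=20)
print(f"dict_factory filter: {t_factory:.4f}s, pre-filter: {t_prefilter:.4f}s")
```

Running this against representative real data, rather than the synthetic `blob`, would give the kind of evidence the question asks for.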

I think that this needs a discussion first on https://discuss.python.org/c/ideas/6

@ericvsmith
Member

I agree this should be discussed on https://discuss.python.org/c/ideas/6 first. @pranaliyawalkar: I'm going to close this until it's discussed there and a conclusion is reached. If need be, this can be re-opened.

My general stance is that adding asdict was probably a mistake. There are just too many options on how it could work, and you should probably write your own version to suit your needs.

@ericvsmith closed this as not planned Jun 14, 2024