Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-129250: allow pickle instances of generic classes #129446

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

tom-pytel
Copy link
Contributor

@tom-pytel tom-pytel commented Jan 29, 2025

This is a maybe PR, not sure if is not overkill for original issue #129250 which may be too niche?

Fixes OP issue by allowing pickling of anonymous type parameters - those which do not exist at module scope but are defined through generic class and function syntax class cls[T]: ... or def func[T](): .... This is done by adding a weakref owner variable which is set on creation for anonymous type params and provides a context for pickling so that on unpickle the type param can be recreated identically. TypeVar, ParamSpec and TypeVarTuple are modifed to work in this manner. Owner is only ever set at creation for type params which have __module__ == typing and the name either does not exist in typing or if it does is not exactly this typeparam.

If this is accepted as the way to go, then here are some questions:

  • TypeAlias does not currently support weak references for some reason so can't use this method to pickle the anon typeparams in type t[A, *B, **C] = .... This is probably not a problem as a TypeAlias can't instantiate objects with its anon typeparams like a func or class can? Can add support anyways now if weakrefs on TypeAlias will be added at some point in the future?

  • What to do on explicit assign to __type_params__, add owner to assigned typeparams where applicable? Remove from old?

  • Only path supported to anonymous type param is sequence index in owner __type_params__, are other paths possible/desired?

  • If builtin___build_class__() is not a good place to set new class typeparams owner, there are other options.

goto done;
}

args = PyTuple_New(2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use PyTuple_Pack.

return 0;
}

PyObject *typing = PyImport_ImportModule("typing");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need to do this, we know when type params come directly from a PEP 695 generic.

An alternative implementation approach could be to save some information on TypeVar creation (when INTRINSIC_TYPEVAR is called). At that point, we don't have the function object yet, but we know what its module, qualname, and index are going to be, so we could store that information on the TypeVar, which should be enough to unpickle it.

Copy link
Contributor Author

@tom-pytel tom-pytel Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the weakref and owner is now stored internally as (__module__, __qualname__, index). The object is looked up at pickle time using this and serialized and deserialized the same way as before.

>>> class cls:
...     class sub[T]: pass
...
>>> cls.sub.__type_params__[0].__reduce__()
(<function _restore_anonymous_typeparam at 0x7f372d712c90>, (<class '__main__.cls.sub'>, 0))

For now didn't go full out and get the __module__ (from globals), __qualname__ (from code object maybe) and index (set by compiler) but rather kept the same post-process step to set owner for typeparams in __tuple_params__. This happens after class setup code is executed for example so would that be a problem?

For functions, _Py_set_type_params_owner() happens in _Py_set_function_type_params() as before but for classes added an intrinsic and some bytecode after the call to the class setup code (no more modification to builtin___build_class__):

  Disassembly of <code object <generic parameters of cls> ...
  ...
  1:0-2:8            LOAD_DEREF               2 (.type_params)
  1:0-2:8            CALL_INTRINSIC_1        10 (INTRINSIC_SUBSCRIPT_GENERIC)
  1:0-2:8            STORE_FAST_LOAD_FAST    17 (.generic_base, .generic_base)
  1:0-2:8            CALL                     3
+ 1:0-2:8            COPY                     1
+ 1:0-2:8            CALL_INTRINSIC_1        12 (INTRINSIC_SET_TYPE_PARAMS_OWNER)
+ 1:0-2:8            POP_TOP
  1:0-2:8            RETURN_VALUE

Is less bytecode than adding owner once per call to INTRINSIC_TYPEVAR and getting those values individually beforehand (which looks to be a bit complicated, neither func __qualname__ nor code object are available yet in compiler at time of INTRINSIC_TYPEVAR code gen). Doing that would still require another intrinsic or at least modifications to INTRINSIC_TYPEVAR, INTRINSIC_TYPEVARTUPLE and INTRINSIC_PARAMSPEC individually. Let me know if that method is preferred.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants