Portable mode: first draft (Brace yourselves) #491

Draft pull request: 1 commit to merge into base branch master

@leogama leogama commented May 31, 2022

This draft is a proof of concept. It is the first complete version of a "portable mode" implementation.

The pickle streams generated by it have the following design:

  • Bootstrap header, which contains:
      • Op-codes to create empty dill and dill._dill modules
      • Minimal logic to check whether a dill module can be imported and, if so, its version
      • The "payload", a compressed pickle stream that is loaded conditionally and contains:
          • A first batch of bootstrapped constructors: _create_code, _create_function, _import_module and _reverse_typemap
          • A second batch of constructors, loaded using the constructors from the first batch, together with their global variables, etc.
  • Body: the standard pickle stream generated by Pickler.dump
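
The header-plus-body layout can be illustrated with a toy stream. This is only a minimal sketch, not the header from this PR: the header below is a hypothetical two-op-code stand-in (push builtins.len, then pop it) that is written directly to the file-like object before the regular pickle, and it leaves the unpickling stack empty.

```python
import io
import pickle

# Hypothetical stand-in header: GLOBAL pushes builtins.len onto the stack,
# POP discards it again, so the stack is empty when the body starts.
header = pickle.GLOBAL + b"builtins\nlen\n" + pickle.POP

# Body: an ordinary pickle stream, as Pickler.dump would produce it.
body = pickle.dumps({"answer": 42}, protocol=4)

stream = io.BytesIO()
stream.write(header)  # written directly, before the regular pickle
stream.write(body)
stream.seek(0)

# A single pickle.load call runs the header op-codes and then the body.
loaded = pickle.load(stream)
print(loaded)
```

The header has no STOP op-code, so the unpickler simply continues into the body and returns the body's object.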

Design considerations:

  • The header must be written directly to the file-like object before calling Pickler.dump, and it must leave the unpickling stack empty when loaded.
  • To be compatible with cPickle in the future, the header can't use memoization, as the memo mapping of cPickle is private (it can only be copied afterwards).
  • The "(un)pickling machine" imposes severe restrictions on conditional execution: there is no looping and no branching by default, and all op-codes are executed and objects created regardless of whether they are used or not.
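
The last point can be observed with a small hand-assembled stream. This is a sketch under stated assumptions: demo_mod and record are hypothetical names registered only for the demonstration, and the stream is built from the op-code constants exported by the pickle module. The call happens even though its result is popped and never used.

```python
import io
import pickle
import sys
import types

# Hypothetical module and function, registered only for this demonstration.
calls = []

def record():
    calls.append("created")
    return "unused object"

demo = types.ModuleType("demo_mod")
demo.record = record
sys.modules["demo_mod"] = demo

stream = (
    pickle.GLOBAL + b"demo_mod\nrecord\n"  # push demo_mod.record
    + pickle.EMPTY_TUPLE                   # push () as the argument tuple
    + pickle.REDUCE                        # call record(), push its result
    + pickle.POP                           # discard the result
    + pickle.NONE                          # push None as the final value
    + pickle.STOP                          # return the top of the stack
)

result = pickle.load(io.BytesIO(stream))
print(result, calls)
```

The stream is executed strictly in order; there is no op-code that would let it skip the REDUCE step.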

This is what drove me to use an internal pickle stream (the "payload") instead of putting everything directly in the header:

  • It can use memoization
  • It can be compressed
  • The objects are not created unnecessarily
  • The conditional population of dill._dill can be implemented as unpickling either the payload or an empty pickle stream
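
One way to picture the "payload or empty stream" idea, written in plain Python rather than op-codes: select between two byte strings by indexing a tuple with a boolean, which needs no branching. This is a rough sketch under assumptions, not the PR's implementation: zlib as the compressor, the placeholder payload contents, and the load_conditionally helper are all hypothetical.

```python
import operator
import pickle
import zlib

# Stand-in payload; the real one would hold the bootstrapped constructors.
payload = zlib.compress(pickle.dumps({"_create_function": "placeholder"}))
# An "empty" inner stream that unpickles to None and creates nothing else.
empty = zlib.compress(pickle.dumps(None))

def load_conditionally(dill_importable):
    # Index a tuple with a bool instead of branching: False -> 0, True -> 1.
    chosen = operator.getitem((payload, empty), dill_importable)
    return pickle.loads(zlib.decompress(chosen))

print(load_conditionally(False))  # payload is unpickled
print(load_conditionally(True))   # empty stream is unpickled
```

Because only function calls are involved, the same selection could in principle be expressed with GLOBAL and REDUCE op-codes, and the payload's objects are created only when it is the stream that gets unpickled.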

Note 1: The code annotation is incomplete and outdated. I may complete it in the next few days.

Note 2: Don't mind the changes made for testing purposes; they were just a quick and dirty way to test the portable mode with a variety of objects.

PS: It's a monster, I know.

leogama commented May 31, 2022

Example of testing the portable mode (Python 3.8+):

import dill
def square(x):
    return x*x
with open('portable.pkl', 'wb') as file:
    dill.dump(square, file, portable=True)

In a different session:

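# If dill is in PYTHONPATH, hide it (only works if dill was not already imported):
import importlib.util  # import the submodule explicitly; "import importlib" alone may not load it
import os, sys
if os.path.exists(os.path.join('dill', '__init__.py')):
    sys.path.remove('')  # drop the current directory, which holds a dill checkout
# Remove every sys.path entry from which dill can still be found
while (spec := importlib.util.find_spec("dill")) is not None:
    sys.path.remove(os.path.dirname(os.path.dirname(spec.origin)))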

import pickle
with open('portable.pkl', 'rb') as file:
    square = pickle.load(file)
print(square(17))

import dill
print(*vars(dill))

@leogama leogama force-pushed the bootstrap-constructor branch 3 times, most recently from fd26449 to f8a3459 Compare May 31, 2022 19:03

leogama commented May 31, 2022

Travis CI failures:

  • 3.7: syntax error, which is OK as this is for Py3.8+
  • 3.8 and 3.9: a weak reference error that I don't understand, as there are strong references to all objects with weak references until the end of the test function (I can reproduce this error on my machine).
  • 3.10: "module importlib has no attribute util" (??? no idea)
  • 3.11-dev: fails because numpy is not installed, fair. But why don't 3.8 and 3.10 fail earlier because of this? Per the configuration, only 3.9 was supposed to have numpy.
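
On the 3.10 error: a plausible cause (an assumption, not checked against the CI logs) is code that calls importlib.util.find_spec after only "import importlib". A submodule becomes an attribute of its package only once something has imported it, so whether importlib.util is available after a bare "import importlib" depends on what the interpreter happened to import at startup. A minimal sketch of the explicit form, using the standard-library pickle module as the lookup target:

```python
import importlib.util  # binds the util submodule explicitly

# find_spec returns a ModuleSpec, or None if the module cannot be found.
spec = importlib.util.find_spec("pickle")
print(spec.name)
```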
