[ABI break] Add new structs with version info and readonly flag #101

Closed
wants to merge 5 commits into from
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -8,7 +8,7 @@ cmake_minimum_required(VERSION 3.2 FATAL_ERROR)
# Set variables:
# * PROJECT_NAME
# * PROJECT_VERSION
project(dlpack VERSION 0.6 LANGUAGES C CXX)
project(dlpack VERSION 0.7.0 LANGUAGES C CXX)

#####
# Change the default build type from Debug to Release, while still
32 changes: 1 addition & 31 deletions NEWS.md
@@ -1,31 +1 @@
DLPack Change Log
=================

This file records the changes in DLPack in reverse chronological order.

## v0.4

- OpaqueHandle type
- Complex support
- Rename DLContext -> DLDevice
- This requires dependent frameworks to upgrade the type name.
- The ABI is backward compatible, as it is only change of constant name.

## v0.3

- Add bfloat16
- Vulkan support


## v0.2
- New device types
- kDLMetal for Apple Metal device
- kDLVPI for verilog simulator memory
- kDLROCM for AMD GPUs
- Add prefix DL to all enum constant values
- This requires dependent frameworks to upgrade their reference to these constant
- The ABI is compatible, as it is only change of constant name.
- Add DLManagedTensor structure for borrowing tensors

## v0.1
- Finalize DLTensor structure
This file has moved to [doc/source/release_notes.rst](/doc/source/release_notes.rst).
Binary file modified docs/source/_static/images/DLPack_diagram.png
18 changes: 18 additions & 0 deletions docs/source/c_api.rst
@@ -10,6 +10,8 @@ Macros

.. doxygendefine:: DLPACK_VERSION

.. doxygendefine:: DLPACK_ABI_VERSION

.. doxygendefine:: DLPACK_DLL

Enumerations
@@ -28,6 +30,22 @@ Structs
.. doxygenstruct:: DLDataType
:members:

.. doxygenstruct:: DLPackVersion
:members:

.. doxygenstruct:: DLTensorVersioned
:members:

.. doxygenstruct:: DLManagedTensorVersioned
:members:

ABI v1 Structs
~~~~~~~~~~~~~~

``DLTensor`` and ``DLManagedTensor`` have no field for exporting version info.
ABI version 2 adds the structs ``DLTensorVersioned`` and
``DLManagedTensorVersioned``, which carry version info and should be used instead.

.. doxygenstruct:: DLTensor
:members:

2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -22,7 +22,7 @@
author = 'DLPack contributors'

# The full version, including alpha/beta/rc tags
release = '0.6.0'
release = '0.7.0'


# -- General configuration ---------------------------------------------------
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -73,6 +73,7 @@ DLPack Documentation

c_api
python_spec
release_notes


Indices and tables
112 changes: 84 additions & 28 deletions docs/source/python_spec.rst
@@ -16,10 +16,12 @@ The array API will offer the following syntax for data interchange:
1. A ``from_dlpack(x)`` function, which accepts (array) objects with a
``__dlpack__`` method and uses that method to construct a new array
containing the data from ``x``.
2. ``__dlpack__(self, stream=None)`` and ``__dlpack_device__`` methods on the
array object, which will be called from within ``from_dlpack``, to query
what device the array is on (may be needed to pass in the correct
stream, e.g. in the case of multiple GPUs) and to access the data.
2. ``__dlpack__(self, stream: int = None, version: int = None)``,
``__dlpack_info__(self)``, and ``__dlpack_device__(self)`` methods
on the array object, which will be called from within ``from_dlpack``,
to access the data, to get the maximum supported DLPack version, and
to query what device the array is on (may be needed to pass in the
correct stream, e.g. in the case of multiple GPUs).
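
As a rough illustration of the call order described in the list above, the following Python sketch uses a hypothetical ``ToyArray`` producer and a hypothetical consumer-side ``from_dlpack``. The device tuple ``(1, 0)`` (intended as ``kDLCPU``, device 0), the version value ``70``, and the dictionary standing in for the ``PyCapsule`` are assumptions made for illustration only; a real producer returns a capsule, and version negotiation is left out here.

```python
class ToyArray:
    """Hypothetical producer; a dict stands in for the real PyCapsule."""

    def __init__(self, data):
        self.data = list(data)
        self.calls = []  # records the order in which the consumer calls us

    def __dlpack_device__(self):
        self.calls.append("__dlpack_device__")
        return (1, 0)  # assumed encoding: (kDLCPU, device_id 0)

    def __dlpack_info__(self):
        self.calls.append("__dlpack_info__")
        return 70  # assumed integer encoding of DLPack 0.7

    def __dlpack__(self, stream=None, version=None):
        self.calls.append("__dlpack__")
        # Placeholder for the capsule that a real producer would build.
        return {"name": "dltensor", "data": self.data,
                "stream": stream, "version": version}


def from_dlpack(x):
    """Hypothetical consumer: query the device first, then fetch the data."""
    device_type, device_id = x.__dlpack_device__()
    # A GPU consumer would choose its stream based on the device here;
    # this CPU-only sketch always passes None.
    stream = None
    capsule = x.__dlpack__(stream=stream)
    return list(capsule["data"])
```

Calling ``from_dlpack(ToyArray([1, 2, 3]))`` in this sketch queries the device before asking for the data, which is the ordering the text implies when a stream has to be chosen per device.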
Review comment:

Can you break this into 3 separate points? They can be subpoints of 2



Semantics
@@ -48,6 +50,14 @@ producer; the producer must synchronize or wait on the stream when necessary.
In the common case of the default stream being used, synchronization will be
unnecessary so asynchronous execution is enabled.

A DLPack version can be requested by passing the ``version`` keyword. The
consumer should call the ``__dlpack_info__`` method to get the maximum
DLPack version supported by the producer and request a version that both
support, e.g. ``min(producer_version, consumer_version)``. If the consumer
doesn't support any version at or below the producer's maximum version, a
``BufferError`` should be raised. Similarly, if the producer doesn't
support the requested version, it should raise a ``BufferError``.
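
The negotiation rule in this paragraph can be sketched as a small helper. The function name ``negotiate_version`` and its parameters are hypothetical, and the sketch assumes that versions are plain integers and that the consumer supports every version between ``consumer_min`` and ``consumer_max``; neither assumption is stated in the proposed spec text.

```python
def negotiate_version(producer_max, consumer_min, consumer_max):
    """Return the version a consumer would request, or raise BufferError.

    Assumes integer versions and a contiguous consumer range
    [consumer_min, consumer_max]; both are illustrative assumptions.
    """
    # Highest version that both sides can handle.
    requested = min(producer_max, consumer_max)
    if requested < consumer_min:
        # The consumer supports nothing at or below the producer's maximum.
        raise BufferError(
            f"no common DLPack version: producer max {producer_max}, "
            f"consumer supports {consumer_min}..{consumer_max}"
        )
    return requested
```

The returned value is what the consumer would then pass as ``version`` to ``__dlpack__``; the producer can still raise ``BufferError`` if it does not support that exact version.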

Review comment:

What happens if the consumer does not specify a version? As I understand things, for the foreseeable future the producer should return a V1 structure. So effectively the default is 1.

Reply from the contributor author:

> What happens if the consumer does not specify a version? As I understand things, for the foreseeable future the producer should return a V1 structure. So effectively the default is 1.

Yes, the default is 1. I will update the spec to say that instead.


Implementation
~~~~~~~~~~~~~~
@@ -65,25 +75,31 @@ struct members, gray text enum values of supported devices and data
types.*

The ``__dlpack__`` method will produce a ``PyCapsule`` containing a
``DLManagedTensor``, which will be consumed immediately within
``from_dlpack`` - therefore it is consumed exactly once, and it will not be
visible to users of the Python API.

The producer must set the ``PyCapsule`` name to ``"dltensor"`` so that
it can be inspected by name, and set ``PyCapsule_Destructor`` that calls
the ``deleter`` of the ``DLManagedTensor`` when the ``"dltensor"``-named
capsule is no longer needed.

The consumer must transer ownership of the ``DLManangedTensor`` from the
capsule to its own object. It does so by renaming the capsule to
``"used_dltensor"`` to ensure that ``PyCapsule_Destructor`` will not get
called (ensured if ``PyCapsule_Destructor`` calls ``deleter`` only for
capsules whose name is ``"dltensor"``), but the ``deleter`` of the
``DLManagedTensor`` will be called by the destructor of the consumer
library object created to own the ``DLManagerTensor`` obtained from the
``DLManagedTensorVersioned`` (or a ``DLManagedTensor``) that is
compatible with the DLPack and DLPack ABI version requested by the
consumer. It will be consumed immediately within ``from_dlpack`` -
therefore it is consumed exactly once, and it will not be visible
to users of the Python API.

The producer must set the ``PyCapsule`` name to ``"dltensor"`` so
that it can be inspected by name, and set ``PyCapsule_Destructor``
that calls the ``deleter`` of the ``DLManagedTensorVersioned`` (or
``DLManagedTensor``) when the ``"dltensor"``-named capsule is no
longer needed.

The consumer must transfer ownership of the ``DLManagedTensorVersioned``
(or ``DLManagedTensor``) from the capsule to its own object. It does so
by renaming the capsule to ``"used_dltensor"`` to ensure that
``PyCapsule_Destructor`` will not get called (ensured if
``PyCapsule_Destructor`` calls ``deleter`` only for capsules whose name
is ``"dltensor"``), but the ``deleter`` of the
``DLManagedTensorVersioned`` (or ``DLManagedTensor``) will be called by
the destructor of the consumer library object created to own the
``DLManagedTensorVersioned`` (or ``DLManagedTensor``) obtained from the
capsule. Below is an example of the capsule deleter written in the Python
C API which is called either when the refcount on the capsule named
``"dltensor"`` reaches zero or the consumer decides to deallocate its array:
``"dltensor"`` reaches zero or the consumer decides to deallocate its
array:

.. code-block:: C

@@ -96,7 +112,7 @@ C API which is called either when the refcount on the capsule named
PyObject *type, *value, *traceback;
PyErr_Fetch(&type, &value, &traceback);

DLManagedTensor *managed = (DLManagedTensor *)PyCapsule_GetPointer(self, "dltensor");
DLManagedTensorVersioned *managed = (DLManagedTensorVersioned *)PyCapsule_GetPointer(self, "dltensor");
Review comment:

How can we distinguish between these two cases? For legacy capsules, we must cast to the older struct, right?

Reply from the contributor author:

> How can we distinguish between these two cases? For legacy capsules, we must cast to the older struct, right?

The offset of dl_tensor.device doesn't change with this ABI breaking change (and we also don't anticipate such a change in the future). So, casting it to either struct and accessing the field should work.

Review comment:

I think the location of the deleter function called below does change.

Reply from the contributor author (@tirthasheshpatel, Apr 7, 2022):

> I think the location of the deleter function called below does change.

Oh, sorry. I misunderstood: I thought you were talking about the __dlpack_device__ method. We can have different deleter functions (one for each supported ABI). Not the cleanest way, but can be done using templating.

Review comment:

Ahh, this is code created by the producer, not the consumer. When the consumer calls obj.__dlpack__, the producer creates a capsule, and the capsule's deleter function consists of the code here.

if (managed == NULL) {
PyErr_WriteUnraisable(self);
goto done;
@@ -115,13 +131,14 @@ C API which is called either when the refcount on the capsule named
Note: the capsule names ``"dltensor"`` and ``"used_dltensor"`` must be
statically allocated.

When the ``strides`` field in the ``DLTensor`` struct is ``NULL``, it indicates a
row-major compact array. If the array is of size zero, the data pointer in
``DLTensor`` should be set to either ``NULL`` or ``0``.
When the ``strides`` field in the ``DLTensorVersioned`` (or ``DLTensor``)
struct is ``NULL``, it indicates a row-major compact array. If the array
is of size zero, the data pointer in ``DLTensorVersioned`` (or
``DLTensor``) should be set to either ``NULL`` or ``0``.
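
To make the ``strides == NULL`` convention concrete, the sketch below computes the strides that a consumer would infer for a row-major compact array. The helper name ``compact_strides`` is hypothetical; strides are expressed in elements rather than bytes, as DLPack strides are.

```python
def compact_strides(shape):
    """Row-major (C-contiguous) strides, in elements, implied by NULL strides."""
    strides = []
    step = 1
    # Walk from the innermost dimension outwards; each stride is the product
    # of all extents to its right.
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))
```

For a shape of ``(2, 3, 4)`` this gives ``(12, 4, 1)``: the last axis is contiguous and each earlier axis steps over everything to its right.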

DLPack version used must be ``0.2 <= DLPACK_VERSION < 1.0``. For further
details on DLPack design and how to implement support for it,
refer to `github.com/dmlc/dlpack <https://github.com/dmlc/dlpack>`_.
For further details on DLPack design and how to implement support for it,
refer to https://github.com/dmlc/dlpack. For details on ABI compatibility
and to upgrade to the new ABI (version 2), refer to :ref:`future-abi-compat`.

.. warning::
DLPack contains a ``device_id``, which will be the device
@@ -136,6 +153,45 @@ refer to `github.com/dmlc/dlpack <https://github.com/dmlc/dlpack>`_.
whether the ``.device`` attribute of the array returned from ``from_dlpack`` is
guaranteed to be in a certain order or not.

.. _future-abi-compat:

Future ABI Compatibility
~~~~~~~~~~~~~~~~~~~~~~~~

ABI version 1 did not provide any fields in the structs ``DLTensor`` or
``DLManagedTensor`` to export version info. Two equivalent structs,
``DLTensorVersioned`` and ``DLManagedTensorVersioned``, have been added
since ABI version 2 (DLPack version 0.7.0) and have a ``version`` field
that can be used to export version info and check if the producer's
DLPack version is compatible with the consumer's DLPack version. This
section gives a path for Python libraries to upgrade to the new ABI
(while preserving support for the old ABI):

* ``__dlpack__`` should accept a ``version`` (int) keyword which is set to
``None`` by default. Consumers can use this kwarg to request certain DLPack
versions. If ``version=None`` or ``version=60`` is requested:

* a capsule named ``"dltensor"`` which uses the old ABI (``DLTensor`` and
``DLManagedTensor``) should be returned (if the producer wants to keep
supporting it) or
* a ``BufferError`` should be raised (if the producer doesn't want to keep
support for the old ABI)

Otherwise, a capsule named ``"dltensor"`` which uses the new ABI
(``DLTensorVersioned`` and ``DLManagedTensorVersioned``) should be returned.
If the requested version is not supported, ``__dlpack__`` should raise a
``BufferError``.
* Producers should implement a ``__dlpack_info__`` method that returns the
maximum supported DLPack version. If this method does not exist, the consumer
must use the old ABI.
* Consumers should call the ``__dlpack_info__`` method to get the maximum DLPack
version supported by the producer. The consumer should then request a DLPack
version (by passing the ``version`` kwarg to the ``__dlpack__`` method) that
both support, e.g. ``min(producer_version, consumer_version)``, or raise a
``BufferError`` if no compatible version exists. If the producer has no
``__dlpack_info__`` method, the consumer should fall back to the old API,
i.e. pass no ``version`` keyword to the ``__dlpack__`` method and expect a
capsule pointing to a ``DLManagedTensor`` (old ABI).
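
The upgrade path in the list above can be sketched end to end in Python. All class and function names here (``VersionedProducer``, ``LegacyProducer``, ``request_capsule``) are hypothetical, the value ``70`` is an assumed integer encoding of DLPack 0.7 alongside the ``60`` used in the text, and a ``(name, struct)`` tuple stands in for the real capsule.

```python
OLD_ABI_VERSION = 60  # DLPack 0.6, as written in the text (old ABI)
NEW_ABI_VERSION = 70  # assumed encoding of DLPack 0.7 (new ABI)


class VersionedProducer:
    """Hypothetical producer that has been upgraded to the new ABI."""

    def __init__(self, keep_old_abi=True):
        self.keep_old_abi = keep_old_abi

    def __dlpack_info__(self):
        return NEW_ABI_VERSION  # maximum supported DLPack version

    def __dlpack__(self, stream=None, version=None):
        if version is None or version == OLD_ABI_VERSION:
            if not self.keep_old_abi:
                raise BufferError("the old ABI is no longer supported")
            return ("dltensor", "DLManagedTensor")
        if version == NEW_ABI_VERSION:
            return ("dltensor", "DLManagedTensorVersioned")
        raise BufferError(f"unsupported DLPack version: {version}")


class LegacyProducer:
    """Hypothetical producer that predates ``__dlpack_info__``."""

    def __dlpack__(self, stream=None):
        return ("dltensor", "DLManagedTensor")


def request_capsule(x, consumer_max=NEW_ABI_VERSION):
    """Hypothetical consumer step: negotiate if possible, else use the old API."""
    info = getattr(x, "__dlpack_info__", None)
    if info is None:
        # No __dlpack_info__: pass no version and expect the old ABI.
        return x.__dlpack__()
    return x.__dlpack__(version=min(info(), consumer_max))
```

With these stand-ins, an upgraded consumer receives the versioned struct from ``VersionedProducer`` and the old struct from ``LegacyProducer``, while a consumer capped at ``60`` still gets the old struct from an upgraded producer that keeps old-ABI support.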

Reference Implementations
~~~~~~~~~~~~~~~~~~~~~~~~~
67 changes: 67 additions & 0 deletions docs/source/release_notes.rst
@@ -0,0 +1,67 @@
.. _release-notes:

DLPack Change Log
=================

This file records the changes in DLPack in reverse chronological order.

v0.7
~~~~

Note: This release contains ABI breaking changes.

- ABI version has been added as ``DLPACK_ABI_VERSION`` macro in the header file.
- Support for OneAPI (``kDLOneAPI``), WebGPU (``kDLWebGPU``), and Hexagon
(``kDLHexagon``) devices has been added.
- Two new structs with a field to export the version info and the readonly
flag have been added: ``DLTensorVersioned`` and ``DLManagedTensorVersioned``.
New implementations should use these structs over ``DLTensor`` and
``DLManagedTensor``. If you have already added support for DLPack, it should
be updated to use the new structs instead (warning: this is an ABI breaking
change for C/C++ libraries. Python libraries should follow the
:ref:`future-abi-compat` section to upgrade to the new ABI without breaking
backward-compatibility).

v0.6
~~~~

- Support for ROCm host memory and CUDA managed memory has been added.

v0.5
~~~~

- Devices kDLGPU and kDLCPUPinned have been renamed to kDLCUDA and kDLCUDAHost
respectively.

v0.4
~~~~

- OpaqueHandle type
- Complex support
- Rename DLContext -> DLDevice
- This requires dependent frameworks to upgrade the type name.
- The ABI is backward compatible, as it is only change of constant name.

v0.3
~~~~

- Add bfloat16
- Vulkan support


v0.2
~~~~

- New device types
- kDLMetal for Apple Metal device
- kDLVPI for verilog simulator memory
- kDLROCM for AMD GPUs
- Add prefix DL to all enum constant values
- This requires dependent frameworks to upgrade their reference to these constant
- The ABI is compatible, as it is only change of constant name.
- Add DLManagedTensor structure for borrowing tensors

v0.1
~~~~

- Finalize DLTensor structure