Support Arrow Tensor type #108

Open
eddyxu opened this issue Aug 15, 2022 · 3 comments
Assignees
Labels
arrow Apache Arrow related issues enhancement New feature or request help wanted Extra attention is needed python rust Rust related tasks

Comments

@eddyxu
Contributor

eddyxu commented Aug 15, 2022

edit: use arrow::extension::FixedShapeTensorType

https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html#fixed-shape-tensor-extension

Problem

In Arrow, Tensors have no corresponding Array type, so they cannot be stored in a Table / Dataset / RecordBatch.

To support ML datasets, it is desirable to have tensors stored within the datasets.

Desired Behavior

Support parametrized Tensors in a Table / Dataset. A TensorArray and TensorType should accept parameters such as shape, data type (i.e., bit_width), etc.

class TensorType {
   TensorType(data_type, shape, ...)
}
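
To make the requested parametrization concrete, here is a minimal standard-library sketch of a type that carries a data type, shape, and bit width, with each tensor flattened into a fixed-size list for columnar storage. The names `TensorType`, `list_size`, and `flatten` are hypothetical and purely illustrative; this is not an Arrow or Lance API.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class TensorType:
    """Hypothetical parametrized tensor type (illustration only)."""

    data_type: str          # element type name, e.g. "float32"
    shape: Tuple[int, ...]  # shape of one tensor, e.g. (2, 3)
    bit_width: int          # bits per element, e.g. 32

    @property
    def list_size(self) -> int:
        # Number of elements in the flattened storage of one tensor.
        return prod(self.shape)


def flatten(values, shape: Sequence[int]) -> List[float]:
    """Flatten a nested list in row-major order, validating it against shape."""
    if not shape:
        return [values]
    if len(values) != shape[0]:
        raise ValueError(f"expected {shape[0]} items, got {len(values)}")
    out: List[float] = []
    for item in values:
        out.extend(flatten(item, shape[1:]))
    return out


tensor_type = TensorType("float32", (2, 3), 32)

# A "column" of two tensors, each stored as one flat list of list_size values.
column = [
    flatten(t, tensor_type.shape)
    for t in (
        [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
        [[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]],
    )
]
```

Because every tensor in the column shares one type, the shape only needs to be stored once on the type rather than per value.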
@eddyxu eddyxu added c++ C++ issues arrow Apache Arrow related issues enhancement New feature or request help wanted Extra attention is needed python labels Aug 15, 2022
@changhiskhan changhiskhan removed the c++ C++ issues label Jul 2, 2023
@changhiskhan changhiskhan changed the title Make a TensorType to allow write arrow.Tensors into Tables Support Arrow Tensor type Jul 2, 2023
@changhiskhan changhiskhan added the rust Rust related tasks label Jul 2, 2023
@rok rok self-assigned this Sep 5, 2023
@rok
Contributor

rok commented Sep 29, 2023

arrow::extension::FixedShapeTensorType and arrow::extension::VariableShapeTensorType will provide access to individual tensors in their respective arrays. arrow::extension::FixedShapeTensorArray provides to_numpy_ndarray on a sliced array, which allows for batch-sized tensors. I think this covers our desired use case.
Am I missing something or can we close this ticket?

@rok
Contributor

rok commented Oct 2, 2023

@eddyxu

4 participants