Update documentation on nullable variable attributes #1330

Open
mamscience opened this issue Sep 22, 2022 · 3 comments


mamscience commented Sep 22, 2022

Hi team,

Could you update the Python section of the documentation? Without an example, I have no clue how to approach nullable variable attributes:
https://docs.tiledb.com/main/how-to/arrays/writing-arrays/nullable-attributes

Many thanks,
Michel

@ihnorton
Member

Hi @mamscience, at the moment, we support pandas nullable datatypes transparently in TileDB-Py. We'll get the docs there updated ASAP, but in the meantime please have a look at these tests. We're also planning to add a new API for writing and querying nullable attributes soon (it will accept or return a numpy bool vector). If you have a particular package/API of interest, please let us know. We may not be able to integrate directly in the short term, but we'll keep it in mind to try to maximize interoperability.
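
As a rough illustration of the pandas route mentioned above, the following sketch builds a DataFrame whose integer column uses the pandas nullable `Int32` dtype and derives a boolean validity vector from it. The column names, the `write_frame` helper, and the choice of `i` and `j` as dimensions are assumptions made for this example, not taken from the TileDB docs or tests. Whether the nullable column round-trips depends on the installed TileDB-Py version.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two coordinate columns and two attribute columns.
df = pd.DataFrame(
    {
        "i": np.array([1, 1, 1], dtype=np.int64),
        "j": np.array([1, 2, 3], dtype=np.int64),
        # pandas nullable integer dtype: missing values are stored as pd.NA.
        "a": pd.array([60, 65, pd.NA], dtype="Int32"),
        # Plain float column: missing values are stored as NaN.
        "b": np.array([120.0, 122.0, np.nan], dtype=np.float32),
    }
)

# Boolean vector marking which cells of "a" hold a real value.
valid_a = df["a"].notna().to_numpy()


def write_frame(uri, frame):
    """Hypothetical helper: write the frame as a sparse TileDB array."""
    import tiledb  # imported here so the rest of the sketch runs without it

    # Assumption: "i" and "j" become dimensions, other columns attributes.
    tiledb.from_pandas(uri, frame, sparse=True, index_dims=["i", "j"])
```

Calling `write_frame("my_array", df)` would then create the array at a URI of your choosing; reading it back with `tiledb.open(uri).df[:]` is one way to check how the missing value in `a` was stored.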

ihnorton transferred this issue from TileDB-Inc/TileDB on Sep 23, 2022

mamscience commented Sep 23, 2022

Thanks. Apparently, changing the NumPy type to float and passing "None" also did the trick.

import numpy as np
import tiledb

schema = tiledb.ArraySchema(
    domain=dom, sparse=True, attrs=[
        tiledb.Attr(name="a", dtype=np.int32, nullable=True),
        tiledb.Attr(name="b", dtype=np.float32, nullable=True),
    ]
)

snip

with tiledb.SparseArray(array_name, mode="w") as A:
    I, J = [1, 1, 1], [1, 2, 3]
    a = np.array([60, 65, 64])
    b = np.array([120, 122, None])
    # Keys must match the attribute names declared in the schema.
    A[I, J] = {"a": a, "b": b}
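
For context on why the float attribute accepts `None` while an integer one is harder, here is a small NumPy-only sketch of what happens to a `None` entry under different dtypes. It shows NumPy's behavior only; how TileDB-Py maps the resulting values onto a nullable attribute is not covered here and may vary by version.

```python
import numpy as np

values = [120, 122, None]

# Without an explicit dtype, NumPy falls back to a generic object array.
as_object = np.array(values)

# With a float dtype, None is converted to NaN.
as_float = np.array(values, dtype=np.float32)

# Integer dtypes have no NaN, so the same conversion raises an error.
try:
    np.array([60, 65, None], dtype=np.int32)
    int_cast_failed = False
except (TypeError, ValueError):
    int_cast_failed = True

print(as_object.dtype, as_float.dtype, int_cast_failed)
```

This is why a missing value in an integer column needs a separate representation, such as a validity mask or a pandas nullable dtype.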


mamscience commented Sep 23, 2022

By the way, the underlying issue is that I must pass all attributes when writing to an array, while my sparse arrays are mostly empty. Relaxing this constraint was proposed before (TileDB-Inc/TileDB#1162 (comment)).

But suppose I need an integer attribute instead of a float one: how should I approach this?
Did you mean that I should build and populate a pandas DataFrame and convert it to a TileDB array? Or is there another way to create a TileDB array with pandas data types?

thanks in advance
