Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct full_domain=True For from_pandas #1239

Open
wants to merge 3 commits into
base: dev
Choose a base branch
from

Conversation

nguyenv
Copy link
Collaborator

@nguyenv nguyenv commented Jul 21, 2022

  • Previously, the tile_max was only correctly calculated for
    np.int64. This correctly set the dim_max to dtype_max - tile
    but only for np.int64
  • All other integer dtypes had dim_max set to dtype_max which
    resulted in a domain range that was too large
  • We also need to account for when the tile extent is larger than
    the range of the full domain. For an instance, when we have a dim
    dtype of np.int8, the default tile extent of 10000 will be much
    larger than the range of the full domain

@nguyenv
Copy link
Collaborator Author

nguyenv commented Jul 21, 2022

I am running into one issue though. When I try to create the full domain for np.int8:

>>> tiledb.Dim(domain=(-128, 126), tile=254, dtype=np.int8)

I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "tiledb/libtiledb.pyx", line 1982, in tiledb.libtiledb.Dim.__init__
  File "tiledb/libtiledb.pyx", line 1963, in tiledb.libtiledb.Dim.__init__
  File "tiledb/libtiledb.pyx", line 579, in tiledb.libtiledb.check_error
  File "tiledb/libtiledb.pyx", line 575, in tiledb.libtiledb._raise_ctx_err
  File "tiledb/libtiledb.pyx", line 560, in tiledb.libtiledb._raise_tiledb_error
tiledb.cc.TileDBError: [TileDB::Dimension] Error: Tile extent check failed; Tile extent exceeds dimension domain range

But I don't think I should be?

@nguyenv nguyenv requested a review from ihnorton July 22, 2022 18:08
@ihnorton
Copy link
Member

I'll have to look closer later, it fails even with tile=120. I think the error message for what's happening here has regressed.

@nguyenv
Copy link
Collaborator Author

nguyenv commented Aug 24, 2022

[sc-8097]
[sc-20752]

Taking a look at this branch again due to a recent related bug.

@shortcut-integration
Copy link

* Previously, the `tile_max` was only correctly calculated for
  `np.int64`. This correctly set the `dim_max` to `dtype_max - tile`
  but only for `np.int64`
* All other integer dtypes had `dim_max` set to `dtype_max` which
  resulted in a domain range that was too large
* We also need to account for when the tile extent is larger than
  the range of the full domain. For an instance, when we have a dim
  dtype of `np.int8`, the default tile extent of 10000 will be much
  larger than the range of the full domain
@nguyenv nguyenv force-pushed the viviannguyen/correct-full-domain-from-pandas branch from 38634c1 to def094a Compare August 24, 2022 15:47
* I was mistakenly under the impression that
  `tiledb.Dim(domain=(-128, 126), tile=254, dtype=np.int8)` was
  a valid tile extent for int8. But since the max value of int8 is
  128, not 256, we can't have an extent larger than that.
@nguyenv
Copy link
Collaborator Author

nguyenv commented Aug 24, 2022

I was mistakenly under the impression that tiledb.Dim(domain=(-128, 126), tile=254, dtype=np.int8) was a valid tile extent for int8. But since the max value of int8 is 128, not 256, we can't have an extent larger than 128. I've corrected the logic in calculating tile size with this in mind.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants