-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hdf5 improvements #7
base: master
Are you sure you want to change the base?
Conversation
|
||
string(REGEX REPLACE "([^;]+)" "${IFC_LIBRARY_DIR}/\\1.lib" IFC_LIBRARIES "${IFC_LIBRARY_NAMES}") | ||
|
||
if (${ifcopenshell_major_version} STREQUAL "0.5") | ||
set(ICU_LIBRARY_NAMES icuuc icudt) | ||
set(ICU_LIBRARY_DIR ${IFCOPENSHELL_ROOT}/deps-vs${MSVC_YEAR}-x86-installed/icu/lib) | ||
set(ICU_INCLUDE_DIR ${IFCOPENSHELL_ROOT}/deps-vs${MSVC_YEAR}-x86-installed/icu/include) | ||
set(ICU_LIBRARY_DIR ${IFCOPENSHELL_ROOT}/_deps-vs${MSVC_YEAR}-${WIN_ARCH}-installed/icu/lib) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the time of the 0.5 build script the dir was still called deps
not _deps
can you change that back?
|
||
// Write a 4D dataset | ||
|
||
const H5std_string FILE_NAME("multidim.h5"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the remainder of this test do? Should we open multi_dim_vox.h5 with RDONLY and check whether the dimensions and ranks of the datasets make sense?
Or even better: let's say we compare storage3->count()
with std::accumulate(dataset.read())
(oversimplified the last) that way we have an actual test with ASSERT_EQUAL.
@@ -0,0 +1,105 @@ | |||
#include "../voxelizer.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the significant difference between this one and test_hdf?
class voxel_writer { | ||
private: | ||
|
||
#define BEGIN_LOOP(i0, i1, j0, j1, k0, k1) {\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are already defined in storage.h which is included
BEGIN_LOOP(size_t(0), nchunks_x, 0U, nchunks_y, 0U, nchunks_z) | ||
auto c = storage->get_chunk(ijk); | ||
|
||
if (c == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make the whitespace formatting a bit consistent here?
The general rule (although undocumented still) for this project is } else {
and the indentation within BEGIN_LOOP is also not consistent.
|
||
const int RANK = 4; | ||
|
||
hsize_t dimsf[4]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find all these variable declarations not immensely useful. I know they do it in the samples, but for me it creates an extra indirection between NX and storage->chunk_size(), I'd rather have a more meaningful name for the variables so that we understand the intent.
I'd also prefer the syntax:
hsize_t dimsf[4] = {continuous_count, chunk_size, chunk_size, chunk_size };
And please remove all the unnecessary spaces between type and variable name. If any of them remain after the reorganization.
hsize_t NZ
^^^^^^
dimsf[RANK]
for consistency
With this advise most of the lines can be eliminated making it much more clear for the casual reader what's going on.
std::vector<uint32_t> hslab; | ||
c->Get(ijk, &v); | ||
hslab.push_back(v); | ||
dataset.write(hslab.data(), H5::PredType::NATIVE_INT, mspace2, dataspace); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem correct, uint32_t is not the same as NATIVE_INT, I'm quite sure this will be completely wrong in 64bit systems. In case c->value_bits() == 32
, the dataset type should be H5::PredType::NATIVE_UINT32
offset[1] = ijk.get(0); | ||
offset[2] = ijk.get(1); | ||
offset[3] = ijk.get(2); | ||
dataspace.selectHyperslab(H5S_SELECT_SET, slab_dimsf, offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest I'm not entirely sure of the performance overhead, could be minimal, could be huge, but writing every individual voxel to file separately seems intuitively to me to be very inefficient. HDF5 is not a simple format, it does a lot of stuff on reading and writing witch caching and chunking. So it seems to me that this will create a lot of overhead for individual writes. Even for simple formats it's faster to write large chunks of data at the same time.
I propose to leave it as is for now though, but please make a comment that we should investigate the performance of this.
if (c->value_bits() == 32) { | ||
|
||
uint32_t v; | ||
std::vector<uint32_t> hslab; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the very least for now, let's not create a vector, if you know it's always size one, just use a single uint32_t v
and write that directly dataset.write(&v, ...
} | ||
|
||
else { | ||
std::vector<int> hslab = { c->Get(ijk) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd make this more explicit.
int v = c->Get(ijk) ? 1 : 0.
So that it's clear what value we're using for the booleans (even though it's super obvious).
Again
dataset.write(&v, ...
In this case it seems wasteful to use NATIVE_INT (64bit) for a single bit. I would use NATIVE_UINT8, which corresponds to the c++ type uint8_t (8bit)
No description provided.