-
Notifications
You must be signed in to change notification settings - Fork 171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support decompression from byte array to ByteBuffer and vice-versa #351
Conversation
Hi @mortengrouleff! You do raise a good point about a downside of passing byte arrays to zstd-jni. zstd-jni already supports byte-array-to-byte-array compression and decompression, so this PR doesn't introduce byte array usage. I don't agree that it's necessarily a better idea to copy data into (and then out of) a direct ByteBuffer in order to avoid a critical section. That's a lot of CPU cycles. It also doesn't always make sense to re-write the application to always use direct ByteBuffers. For example, HBase has pretty sophisticated memory management where it uses a pool of direct ByteBuffers to hold HBase data. However, there is a limit to the direct memory the JVM is allowed to use. If the pool is empty, HBase will use a heap ByteBuffer (aka byte array). Additionally, some of the in-memory cache implementations that HBase uses for caching data on the server are on-heap. I think the best way forward is to support both byte array and direct ByteBuffer usage in zstd-jni, document the tradeoffs, and let the user decide. |
void *dst_buff = (*env)->GetPrimitiveArrayCritical(env, dst, NULL); | ||
if (dst_buff == NULL) goto E1; | ||
char *src_buff = (char*)(*env)->GetDirectBufferAddress(env, src); | ||
if (src_buff == NULL) return -ZSTD_error_memory_allocation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will leave the dst_buff
behind critical lock that is never going to be released. I suggest adding E2 on the line before E1 and jump there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another option is to swap the GetPrimitiveArrayCritical
and GetDirectBufferAddress
sections
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fixed, thank you.
Yes, I don't think there is harm in adding these methods. Also AFAIK with JDK 21 + G2GC we get better behaviour for critical section locks - they got pinned and don't block the GC. Added one comment. I will merge after it's addressed |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #351 +/- ##
============================================
+ Coverage 60.01% 60.02% +0.01%
- Complexity 308 318 +10
============================================
Files 26 27 +1
Lines 1473 1566 +93
Branches 170 181 +11
============================================
+ Hits 884 940 +56
- Misses 434 460 +26
- Partials 155 166 +11 🚀 New features to boost your workflow:
|
Thank you very much! I've addressed the bug. |
Thanks! Merging it. I will wait another week to see if there are any fall-outs from the recent upgrade to zstd-1.5.7 before releasing new version. |
Thank you! |
@charlesconnell , just pushed it as 1.5.7-2 |
HBase uses zstd-jni for one implemention of its ZStandard support. I've been working on improving ZStandard decompression performance in HBase. My goal is to avoid copying data any more than absolutely necessary. In some situations, we have our compressed data in a byte array, and need to decompress into a direct ByteBuffer, or vice versa. This is not currently possible without an extra copy. In this PR, I expose the methods to accomplish this. My hope is that this PR gets accepted, and HBase can use the next version of zstd-jni to get a little performance boost.