-
Notifications
You must be signed in to change notification settings - Fork 20.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
trie: parallelize committer #30461
trie: parallelize committer #30461
Conversation
Without benchmarks to support it, these kind of changes aren’t “evaluable”.
We’re tried in the past and it’s extremely hard to make such a change, it
will mostly probably be slower rather than faster unfortunately.
…On Wed, 18 Sep 2024 at 18:58, steven ***@***.***> wrote:
@stevemilk <https://github.com/stevemilk> requested your review on: #30461
<#30461> trie: parallize
committer as a code owner.
—
Reply to this email directly, view it on GitHub
<#30461 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAA7UGJHXQ3NOXEIAS5ED23ZXGPJNAVCNFSM6AAAAABOOATFDCVHI2DSMVQWIX3LMV45UABCJFZXG5LFIV3GK3TUJZXXI2LGNFRWC5DJN5XDWMJUGMYTAMJYGE2TONQ>
.
You are receiving this because your review was requested.Message ID:
***@***.***>
|
Got it, will do benchmark test then. |
Benchmark results on my linux machine
|
I also looked at those charts a bit. I'm a bit confused. Where is |
@holiman Yes, i just deployed it on a ongoing benchmark pairs (new release vs last release) to have a quick test. |
True, we need to update them |
} | ||
} | ||
|
||
type wrapNode struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole concept of wrapping nodes -- I don't see the point in it. Why is that needed? Couldn't you just copy the path for each goroutine, and then let each goroutine work on it's own path-copy individually without risking any cross-goroutine disruptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because multiple goroutines calling AddNode
will cause concurrent map writes.
This could be solved by using mutex, but using wrapping nodes to avoid lock/unlock can save a little more time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, if we're parallelizing things, you could have a chan
where you send off the nodes, and then have a dedicated goroutine which just reads the chan and invokes AddNode
sequentially. If you make the chan use some buffering, then it should be faster than using a lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I did try this approach before. However, it still requires a new struct, similar to WrapNode
, to initialize the chan
since AddNode
requires both the node itself and the path. Additionally, we would need an extra mechanism to monitor when the send operation is completed. Here's the old draft PR illustrating this approach, rough version.
For the idea of parallelizing committer and invoking AddNode
sequentially, current PR can also achieve in a cleaner way, meanwhile just as fast. I believe both approaches are valid and can work well, and I'm open to further discussion if you think there's a strong case for the alternative.
I've attempted to do it differently. Concentrating only on the larg(er) trie, here are my numbers for this PR:
And the unmodified master, with a similar benchmark:
Thus: for non-parallel, this PR is a degradation, most likely because of the additional mem usage, which goes up by ~50%. So speed goes from The memory overhead is what I saw in the graphs above, on the live benchmark nodes. In my branch
With that approach, I get
So the OBS: I have not ascertained that the implementation actually is correct: that the nodes collected are the right ones. So the next step would be to actually test that it does what it's supposed to do. But all in all, I don't think the current |
Fixed up now so it is correct. New numbers on my branch (basically no change):
|
Closing this in favour of #30545 . Thanks for picking it up from the start! |
Make node commit to be able to run in parallel, like node hash in
hasher.go
.