-
Notifications
You must be signed in to change notification settings - Fork 20.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
triedb/pathdb, eth: introduce Double-Buffer Mechanism in PathDB #30464
base: master
Are you sure you want to change the base?
Conversation
b48c0c9
to
20b4ffd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All in all, this looks promising, I suspect this could help quite a bit
nodes := writeNodes(batch, b.nodes, clean) | ||
rawdb.WritePersistentStateID(batch, id) | ||
|
||
// Flush all mutations in a single batch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: at this point, mutations were already applied on the clean
, i.e, dl.cleans
cache. That happened during writeNodes
. I've tried to figure out if that is a problem, but come to the conclusion that it's fine, but just wanted to raise it so you can also give it a think.
Regarding "flush all mutations in a single batch" -- is that important only because of crash-safety, or some other more subtle reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this
in disklayer.go
, function node()
, we lookup a node. Order:
- buffer
- frozen
- cleans
- database
- And if found, write to cleans
if dl.cleans != nil && len(blob) > 0 {
dl.cleans.Set(key, blob)
cleanWriteMeter.Mark(int64(len(blob)))
}
I'm trying to think of a case where this write-to-cleans conflicts with the write-to-cleans in the background committer writeNodes
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- if it's found in buffer/frozen => return and no interaction with cache
- if it's found in cache => return
- if it's found in disk (it implicitly means the item is not in these places above, even the item is marked as deleted, it will still be caught in buffer/frozen/cache), load it from db and add it into the cache
so, no conflict should happen
But i have to say it's a really good point, i haven't thought about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding "flush all mutations in a single batch" -- is that important only because of crash-safety, or some other more subtle reason?
Only because of crash-safety
Previously, PathDB used a single buffer to aggregate database writes, which needed to be flushed atomically. However, flushing large amounts of data (e.g., 256MB) caused significant overhead, often blocking the system for around 3 seconds during the flush. To mitigate this overhead and reduce performance spikes, a double-buffer mechanism is introduced. When the active buffer fills up, it is marked as frozen and a background flushing process is triggered. Meanwhile, a new buffer is allocated for incoming writes, allowing operations to continue uninterrupted. This approach reduces system blocking times and provides flexibility in adjusting buffer parameters for improved performance.
432633f
to
fc0cd1e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, would be interesting to see some performance-charts. This PR needs some runtime before merging, IMO
For sure, it’s not a please-merge-it pull request, it will be twisted a bit
and have a full performance impact inspection
Thanks and Best regards Gary rong
Martin HS ***@***.***>于2024年9月23日 周一下午2:49写道:
… ***@***.**** approved this pull request.
LGTM, would be interesting to see some performance-charts. This PR needs
some runtime before merging, IMO
—
Reply to this email directly, view it on GitHub
<#30464 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABNO6OOSVOX3WRFPVBR6AJ3ZX62XNAVCNFSM6AAAAABOPFBYCCVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDGMRRGI2DKNRVGY>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
Referenced #28471 ? |
Previously, PathDB used a single buffer to aggregate database writes, which needed to be flushed atomically. However, flushing large amounts of data (e.g., 256MB) caused significant overhead, often blocking the system for around 3 seconds during the flush.
To mitigate this overhead and reduce performance spikes, a double-buffer mechanism is introduced. When the active buffer fills up, it is marked as frozen and a background flushing process is triggered. Meanwhile, a new buffer is allocated for incoming writes, allowing operations to continue uninterrupted.
This approach reduces system blocking times and provides flexibility in adjusting buffer parameters for improved performance.