triedb/pathdb, eth: introduce Double-Buffer Mechanism in PathDB #30464

rjl493456442 · 2024-09-19T06:18:10Z

Previously, PathDB used a single buffer to aggregate database writes, which needed to be flushed atomically. However, flushing large amounts of data (e.g., 256MB) caused significant overhead, often blocking the system for around 3 seconds during the flush.

To mitigate this overhead and reduce performance spikes, a double-buffer mechanism is introduced. When the active buffer fills up, it is marked as frozen and a background flushing process is triggered. Meanwhile, a new buffer is allocated for incoming writes, allowing operations to continue uninterrupted.

This approach reduces system blocking times and provides flexibility in adjusting buffer parameters for improved performance.
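For readers who want a concrete picture of the mechanism described above, here is a minimal sketch. It is not the PathDB code: the `doubleBuffer` type, its fields, the `persist` callback, and the byte-count threshold are all hypothetical names chosen for illustration, and the size accounting is deliberately approximate.

```go
package main

import (
	"fmt"
	"sync"
)

// doubleBuffer is a hypothetical, simplified model of a double-buffer
// write path. It is not the PathDB implementation.
type doubleBuffer struct {
	mu      sync.Mutex
	limit   int                     // freeze threshold, in bytes of values
	size    int                     // approximate bytes held by the active buffer
	active  map[string][]byte       // receives incoming writes
	frozen  map[string][]byte       // read-only; handed to the background flush
	done    chan struct{}           // closed when the most recent flush finishes
	persist func(map[string][]byte) // stand-in for the atomic database write
}

func newDoubleBuffer(limit int, persist func(map[string][]byte)) *doubleBuffer {
	return &doubleBuffer{limit: limit, active: make(map[string][]byte), persist: persist}
}

// Put stores a write in the active buffer and freezes the buffer once it
// reaches the size limit.
func (b *doubleBuffer) Put(key string, val []byte) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.active[key] = val
	b.size += len(val)
	if b.size >= b.limit {
		b.freezeLocked()
	}
}

// freezeLocked swaps in a fresh active buffer and flushes the old one in the
// background. Only one frozen buffer exists at a time, so the caller stalls
// only if the previous flush is still running.
func (b *doubleBuffer) freezeLocked() {
	if b.done != nil {
		<-b.done
	}
	frozen, done := b.active, make(chan struct{})
	b.frozen, b.done = frozen, done
	b.active, b.size = make(map[string][]byte), 0
	go func() {
		defer close(done)
		b.persist(frozen) // the frozen map is never mutated again
	}()
}

// Get checks the active buffer first, then the frozen one.
func (b *doubleBuffer) Get(key string) ([]byte, bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if v, ok := b.active[key]; ok {
		return v, true
	}
	v, ok := b.frozen[key]
	return v, ok
}

// Wait blocks until any in-flight flush has finished.
func (b *doubleBuffer) Wait() {
	b.mu.Lock()
	done := b.done
	b.mu.Unlock()
	if done != nil {
		<-done
	}
}

func main() {
	disk := make(map[string][]byte)
	buf := newDoubleBuffer(8, func(nodes map[string][]byte) {
		for k, v := range nodes {
			disk[k] = v
		}
	})
	buf.Put("a", []byte("1234"))
	buf.Put("b", []byte("5678")) // reaches the limit: freeze and flush
	buf.Put("c", []byte("9"))    // lands in the new active buffer
	buf.Wait()
	fmt.Println("entries on disk:", len(disk))
}
```

In this sketch the frozen buffer stays readable after its flush completes, until the next freeze replaces it. That is harmless here because the active buffer is always consulted first and the frozen contents match what was persisted.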

holiman

All in all, this looks promising. I suspect it could help quite a bit.

triedb/pathdb/nodebuffer.go

holiman · 2024-09-19T06:58:09Z


+		nodes := writeNodes(batch, b.nodes, clean)
+		rawdb.WritePersistentStateID(batch, id)
+
+		// Flush all mutations in a single batch


Note: at this point, the mutations have already been applied to the clean cache, i.e., dl.cleans. That happened during writeNodes. I tried to figure out whether that is a problem and concluded that it's fine, but I wanted to raise it so you can also give it some thought.

Regarding "flush all mutations in a single batch" -- is that important only because of crash-safety, or some other more subtle reason?

How about this: in disklayer.go, function node(), we look up a node in this order:

1. buffer
2. frozen
3. cleans
4. database

And if found, write to cleans:

    if dl.cleans != nil && len(blob) > 0 {
        dl.cleans.Set(key, blob)
        cleanWriteMeter.Mark(int64(len(blob)))
    }

I'm trying to think of a case where this write-to-cleans conflicts with the write-to-cleans in the background committer's writeNodes method.

rjl493456442 · 2024-09-19

If it's found in buffer/frozen => return, with no interaction with the cache.

If it's found in the cache => return.

If it's found on disk => load it from the db and add it to the cache. (This implicitly means the item is in none of the places above; even an item marked as deleted would still be caught in buffer/frozen/cache.)

So no conflict should happen.

But I have to say it's a really good point; I hadn't thought about it.
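The lookup order and the three cases discussed in this thread can be sketched as follows. This is an illustration only: the `layers` type and its plain maps are hypothetical stand-ins for the real buffer, frozen buffer, clean cache, and database, and the returned source string exists only to make the path visible.

```go
package main

import "fmt"

// layers is a hypothetical stand-in for the four tiers consulted by the
// disk layer. Plain maps replace the real buffers, cache, and database.
type layers struct {
	buffer map[string][]byte // active write buffer
	frozen map[string][]byte // frozen buffer being flushed
	cleans map[string][]byte // clean cache
	db     map[string][]byte // persistent database
}

// node resolves key and reports which tier served it. An empty blob found in
// buffer or frozen models a deletion marker and shadows the tiers below.
func (l *layers) node(key string) ([]byte, string) {
	if blob, ok := l.buffer[key]; ok {
		return blob, "buffer" // no interaction with the cache
	}
	if blob, ok := l.frozen[key]; ok {
		return blob, "frozen" // no interaction with the cache
	}
	if blob, ok := l.cleans[key]; ok {
		return blob, "cleans"
	}
	// Reaching the database implies the key is absent from every tier above,
	// so populating the cache cannot overwrite a newer value.
	blob := l.db[key]
	if l.cleans != nil && len(blob) > 0 {
		l.cleans[key] = blob
	}
	return blob, "database"
}

func main() {
	l := &layers{
		buffer: map[string][]byte{"y": []byte("new")},
		frozen: map[string][]byte{"z": nil}, // deletion marker
		cleans: map[string][]byte{},
		db:     map[string][]byte{"x": []byte("old"), "z": []byte("stale")},
	}
	for _, key := range []string{"x", "x", "y", "z"} {
		blob, src := l.node(key)
		fmt.Printf("%s: %q from %s\n", key, blob, src)
	}
}
```

The second lookup of `x` is served from the clean cache because the first one populated it, while `z` never reaches the stale database entry because the deletion marker in the frozen buffer is found first.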

Regarding "flush all mutations in a single batch" -- is that important only because of crash-safety, or some other more subtle reason?

Only because of crash-safety
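To illustrate the crash-safety argument, here is a minimal sketch of an all-or-nothing flush. The `batch` and `store` types, the `flush` helper, and the key names are hypothetical and do not reflect the real database API. The point is only that the nodes and the persistent state ID are staged together and become visible together, so a crash before the commit leaves the previous consistent state in place.

```go
package main

import (
	"errors"
	"fmt"
)

// batch stages writes in memory until they are committed.
type batch struct{ ops map[string][]byte }

func newBatch() *batch { return &batch{ops: make(map[string][]byte)} }

func (b *batch) put(key string, val []byte) { b.ops[key] = val }

// store is a toy key-value store whose commit either applies a whole batch
// or nothing at all.
type store struct {
	data  map[string][]byte
	crash bool // when true, simulate a crash before the commit lands
}

var errCrash = errors.New("simulated crash before commit")

func (s *store) commit(b *batch) error {
	if s.crash {
		return errCrash // nothing from the batch reaches s.data
	}
	for k, v := range b.ops {
		s.data[k] = v
	}
	return nil
}

// flush stages the nodes and the state ID in one batch, so the ID on disk
// never runs ahead of, or behind, the nodes it describes.
func flush(s *store, nodes map[string][]byte, id string) error {
	b := newBatch()
	for path, blob := range nodes {
		b.put("node/"+path, blob)
	}
	b.put("stateID", []byte(id))
	return s.commit(b)
}

func main() {
	s := &store{data: map[string][]byte{"stateID": []byte("1")}, crash: true}
	nodes := map[string][]byte{"a": []byte("blob")}

	err := flush(s, nodes, "2")
	fmt.Println(err, string(s.data["stateID"]), len(s.data))

	s.crash = false
	err = flush(s, nodes, "2")
	fmt.Println(err, string(s.data["stateID"]), len(s.data))
}
```

If the nodes and the state ID were written in separate steps, a crash between them could leave an ID that does not match the nodes on disk. Staging both in one batch avoids that window.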


holiman

LGTM, would be interesting to see some performance-charts. This PR needs some runtime before merging, IMO

rjl493456442 · 2024-09-23T07:13:34Z

For sure, it's not a please-merge-it pull request. It will be tweaked a bit and go through a full performance impact inspection.

Thanks and best regards,
Gary Rong


joeylichang · 2024-10-10T03:05:47Z

Referenced #28471?

rjl493456442 requested review from karalabe and holiman as code owners September 19, 2024 06:18

rjl493456442 force-pushed the multibuffer branch 2 times, most recently from b48c0c9 to 20b4ffd Compare September 19, 2024 07:08

holiman reviewed Sep 19, 2024


rjl493456442 force-pushed the multibuffer branch from 432633f to fc0cd1e Compare September 23, 2024 05:01

holiman approved these changes Sep 23, 2024


holiman mentioned this pull request Oct 14, 2024

all: unify the trie database and snapshot in path mode #30159

Open

rjl493456442 mentioned this pull request Oct 15, 2024

core, trie, triedb: port changes from the snapshot integration #30599

Merged

