chore: iavl/v2 alpha6 [DNM] #1043
Draft: kocubinski wants to merge 2 commits into cosmos:master from kocubinski:master
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,62 @@ | ||
# IAVL v2 | ||
# iavl/v2 | ||
|
||
IAVL v2 is a performance-minded rewrite of IAVL v1. Benchmarks show a 10-20x improvement in | ||
throughput depending on the operation. The primary changes are: | ||
|
||
- Checkpoints: periodic writes of dirty branch nodes to disk. | ||
- Leaf changelog: leaf nodes are flushed to disk at every version. | ||
- Replay: revert the tree to a previous version by replaying the leaf changelog. | ||
- Sharding: shards are created on pruning events. | ||
- BTree on disk: SQLite (a mature BTree implementation) is used for storage. | ||
- Cache: the AVL tree is cached in memory and (non-dirty) nodes are evicted by a configurable policy. | ||
|
||
## Concepts | ||
|
||
### Checkpoints | ||
|
||
A checkpoint writes to disk all branch nodes dirtied in memory since the last checkpoint. | ||
Checkpoints are distinct from shards. One shard may contain multiple checkpoints. A checkpoint occurs | ||
at a configurable interval or when the dirty branch nodes exceed a threshold. | ||
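The two triggers above can be sketched as a small predicate. This is only an illustration: `checkpointPolicy`, its field names, and the choice to measure the threshold as a count of dirty branch nodes are assumptions, not the iavl/v2 API.

```go
package main

import "fmt"

// checkpointPolicy is a hypothetical container for the two triggers.
// The threshold is assumed to be a count of dirty branch nodes.
type checkpointPolicy struct {
	interval       int64 // versions between checkpoints; 0 disables this trigger
	dirtyThreshold int64 // dirty branch nodes tolerated; 0 disables this trigger
}

// shouldCheckpoint reports whether either trigger fires at the given version.
func (p checkpointPolicy) shouldCheckpoint(version, lastCheckpoint, dirtyBranches int64) bool {
	if p.interval > 0 && version-lastCheckpoint >= p.interval {
		return true
	}
	return p.dirtyThreshold > 0 && dirtyBranches > p.dirtyThreshold
}

func main() {
	p := checkpointPolicy{interval: 80, dirtyThreshold: 1_000_000}
	fmt.Println(p.shouldCheckpoint(81, 1, 10))        // interval reached
	fmt.Println(p.shouldCheckpoint(40, 1, 2_000_000)) // threshold exceeded
	fmt.Println(p.shouldCheckpoint(40, 1, 10))        // neither trigger fires
}
```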
|
||
### Leaf Changelog | ||
|
||
The leaf changelog is a list of leaf nodes that have been written since the last checkpoint. Inserts and | ||
updates are in one table, deletes in another. They are ordered by a sequence number per version to allow for | ||
deterministic replay. This also makes it possible to evict leaves from the tree and rely on SQLite's | ||
page cache and memory map to manage efficient access to leaves. | ||
|
||
### Replay | ||
|
||
Replay is the process of reverting the tree to a previous version. Given a version v, the tree is loaded at | ||
the checkpoint version m less than or equal to v. The leaf changelog is replayed from m to v. The tree is now at | ||
version v. | ||
|
||
This is useful for rolling back, or querying and proving the state of the tree at a previous version. | ||
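A minimal sketch of the replay procedure follows, using a plain map as a stand-in for the tree. The `change` type, `nearestCheckpoint`, and `replay` are hypothetical names for illustration. The sketch assumes that m is the most recent checkpoint at or below v and that the checkpoint already contains the state at m, so only versions m+1 through v are replayed.

```go
package main

import (
	"fmt"
	"sort"
)

// change is a hypothetical leaf changelog entry; the real schema is not shown here.
type change struct {
	version  int64
	sequence int64
	key      string
	value    string
	deleted  bool
}

// nearestCheckpoint returns the largest checkpoint version m with m <= v.
// checkpoints must be sorted in ascending order.
func nearestCheckpoint(checkpoints []int64, v int64) (int64, bool) {
	i := sort.Search(len(checkpoints), func(i int) bool { return checkpoints[i] > v })
	if i == 0 {
		return 0, false
	}
	return checkpoints[i-1], true
}

// replay applies every change with m < version <= v to state,
// ordered by (version, sequence) so the result is deterministic.
func replay(state map[string]string, log []change, m, v int64) {
	var selected []change
	for _, c := range log {
		if c.version > m && c.version <= v {
			selected = append(selected, c)
		}
	}
	sort.SliceStable(selected, func(i, j int) bool {
		if selected[i].version != selected[j].version {
			return selected[i].version < selected[j].version
		}
		return selected[i].sequence < selected[j].sequence
	})
	for _, c := range selected {
		if c.deleted {
			delete(state, c.key)
		} else {
			state[c.key] = c.value
		}
	}
}

func main() {
	checkpoints := []int64{80, 160, 240}
	m, ok := nearestCheckpoint(checkpoints, 170)
	if !ok {
		fmt.Println("no checkpoint at or below the requested version")
		return
	}
	state := map[string]string{"a": "1"} // stand-in for the tree loaded at m
	log := []change{
		{version: 161, sequence: 0, key: "b", value: "2"},
		{version: 161, sequence: 1, key: "b", value: "3"},
		{version: 162, sequence: 0, key: "a", deleted: true},
		{version: 171, sequence: 0, key: "c", value: "9"}, // beyond v, ignored
	}
	replay(state, log, m, 170)
	fmt.Println(m, state)
}
```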
|
||
### Sharding | ||
|
||
A shard contains all the changes to a tree from version m to version n. It may contain multiple checkpoints. | ||
|
||
### BTree (SQLite) | ||
|
||
Why SQLite? A B+Tree is a very efficient on disk data structure. The ideal implementation of IAVL on disk | ||
would be to lay out nodes in subtree chunks in the same format as the in-memory AVL tree. A B+Tree is | ||
as close an approximation to this as possible. | ||
|
||
## Pruning | ||
|
||
Parameters: | ||
|
||
- invalidated ratio: the ratio of invalidated nodes to total nodes in a shard that triggers a | ||
pruning event. The default is 1.5. This roughly correlates to the disk size of a complete tree, where | ||
(2 * ratio) is the size of the pre-pruned tree on disk. A ratio of 1.5 means that 3x the initial size | ||
should be provisioned. | ||
- minimum keep versions: the minimum number of versions to keep. This is a safety feature to | ||
prevent pruning to a version that is too recent. The default is 100. | ||
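The arithmetic behind the invalidated ratio can be sketched as below. Both function names are hypothetical, and two details are assumptions of this sketch rather than documented behavior: the trigger fires when the ratio is reached or exceeded, and a non-positive ratio disables pruning.

```go
package main

import "fmt"

// shouldPrune reports whether a shard's invalidated/total node ratio has
// reached the configured ratio. A non-positive ratio is treated as "disabled",
// which is an assumption of this sketch.
func shouldPrune(invalidated, total int64, ratio float64) bool {
	if ratio <= 0 || total <= 0 {
		return false
	}
	return float64(invalidated)/float64(total) >= ratio
}

// provisionedSize applies the (2 * ratio) rule of thumb to an initial tree size.
func provisionedSize(initialBytes int64, ratio float64) int64 {
	return int64(2 * ratio * float64(initialBytes))
}

func main() {
	const ratio = 1.5 // the stated default
	fmt.Println(shouldPrune(1_400, 1_000, ratio))
	fmt.Println(shouldPrune(1_500, 1_000, ratio))
	// Disk to provision for a 10 GiB initial tree, expressed in GiB.
	fmt.Println(provisionedSize(10<<30, ratio) >> 30)
}
```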
|
||
Pruning events only occur on checkpoint boundaries. The prune version is the most recent | ||
checkpoint less than or equal to the requested prune version. | ||
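One way to picture how a prune request resolves to a checkpoint boundary is the sketch below. `pruneVersion` is a hypothetical helper. Applying the minimum keep versions limit before snapping to a checkpoint, and the exact off-by-one of that limit, are assumptions.

```go
package main

import "fmt"

// pruneVersion returns the version a prune request resolves to, or false
// when no checkpoint qualifies.
func pruneVersion(requested, latest, minKeep int64, checkpoints []int64) (int64, bool) {
	// Never prune into the most recent minKeep versions (assumed semantics).
	if limit := latest - minKeep; requested > limit {
		requested = limit
	}
	// Snap down to the most recent checkpoint at or below the request.
	var best int64
	found := false
	for _, c := range checkpoints {
		if c <= requested && (!found || c > best) {
			best, found = c, true
		}
	}
	return best, found
}

func main() {
	checkpoints := []int64{80, 160, 240, 320}
	fmt.Println(pruneVersion(300, 330, 100, checkpoints)) // clamped to 230, then snapped down
	fmt.Println(pruneVersion(50, 330, 100, checkpoints))  // no checkpoint at or below 50
}
```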
|
||
On prune, the latest shard is locked (read-only) and a new shard is created. The new shard is now | ||
the hot shard and subsequent SaveVersion calls write leaves and branches to it. | ||
|
||
Deletes happen by writing a new shard without orphans, updating the shard connection, then | ||
dropping the old one. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,155 @@ | ||
package bench | ||
|
||
import ( | ||
"net/http" | ||
"os" | ||
"runtime/pprof" | ||
"testing" | ||
"time" | ||
|
||
"github.com/cosmos/iavl/v2" | ||
"github.com/cosmos/iavl/v2/metrics" | ||
"github.com/cosmos/iavl/v2/testutil" | ||
"github.com/prometheus/client_golang/prometheus" | ||
"github.com/prometheus/client_golang/prometheus/promauto" | ||
"github.com/prometheus/client_golang/prometheus/promhttp" | ||
"github.com/spf13/cobra" | ||
"github.com/stretchr/testify/require" | ||
) | ||
|
||
func Command() *cobra.Command { | ||
cmd := &cobra.Command{ | ||
Use: "bench", | ||
Short: "run benchmarks", | ||
} | ||
cmd.AddCommand(benchCommand()) | ||
return cmd | ||
} | ||
|
||
func benchCommand() *cobra.Command { | ||
var ( | ||
dbPath string | ||
changelogPath string | ||
loadSnapshot bool | ||
usePrometheus bool | ||
cpuProfile string | ||
) | ||
cmd := &cobra.Command{ | ||
Use: "std", | ||
Short: "run the std development benchmark", | ||
Long: `Runs a longer benchmark for the IAVL tree. This is useful for development and testing. | ||
Prerequisites for this command: | ||
$ go run ./cmd gen tree --db /tmp/iavl-v2 --limit 1 --type osmo-like-many | ||
$ mkdir -p /tmp/osmo-like-many/v2 && go run ./cmd gen emit --start 2 --limit 1000 --type osmo-like-many --out /tmp/osmo-like-many/v2 | ||
|
||
Optional for --snapshot arg: | ||
$ go run ./cmd snapshot --db /tmp/iavl-v2 --version 1 | ||
`, | ||
|
||
RunE: func(_ *cobra.Command, _ []string) error { | ||
if cpuProfile != "" { | ||
f, err := os.Create(cpuProfile) | ||
if err != nil { | ||
return err | ||
} | ||
if err := pprof.StartCPUProfile(f); err != nil { | ||
return err | ||
} | ||
defer func() { | ||
pprof.StopCPUProfile() | ||
f.Close() | ||
}() | ||
} | ||
t := &testing.T{} | ||
treeOpts := iavl.DefaultTreeOptions() | ||
treeOpts.CheckpointInterval = 80 | ||
treeOpts.StateStorage = true | ||
treeOpts.HeightFilter = 1 | ||
treeOpts.EvictionDepth = 22 | ||
treeOpts.PruneRatio = 0 | ||
treeOpts.MetricsProxy = metrics.NewStructMetrics() | ||
if usePrometheus { | ||
treeOpts.MetricsProxy = newPrometheusMetricsProxy() | ||
} | ||
|
||
var multiTree *iavl.MultiTree | ||
if loadSnapshot { | ||
pool := iavl.NewNodePool() | ||
var err error | ||
multiTree, err = iavl.ImportMultiTree(pool, 1, dbPath, treeOpts) | ||
require.NoError(t, err) | ||
} else { | ||
multiTree = iavl.NewMultiTree(dbPath, treeOpts) | ||
require.NoError(t, multiTree.MountTrees()) | ||
require.NoError(t, multiTree.LoadVersion(1)) | ||
require.NoError(t, multiTree.WarmLeaves()) | ||
} | ||
|
||
opts := testutil.CompactedChangelogs(changelogPath) | ||
opts.SampleRate = 250_000 | ||
|
||
// opts.Until = 1_000 | ||
// opts.UntilHash = "557663181d9ab97882ecfc6538e3b4cfe31cd805222fae905c4b4f4403ca5cda" | ||
opts.Until = 500 | ||
opts.UntilHash = "2670bd5767e70f2bf9e4f723b5f205759e39afdb5d8cfb6b54a4a3ecc27a1377" | ||
|
||
multiTree.TestBuild(t, opts) | ||
return nil | ||
}, | ||
} | ||
cmd.Flags().StringVar(&dbPath, "db", "/tmp/iavl-v2", "the path to the database at version 1") | ||
cmd.Flags().StringVar(&changelogPath, "changelog", "/tmp/osmo-like-many/v2", "the path to the changelog") | ||
cmd.Flags().BoolVar(&loadSnapshot, "snapshot", false, "load the snapshot at version 1 before running the benchmarks (loads full tree into memory)") | ||
cmd.Flags().BoolVar(&usePrometheus, "prometheus", false, "enable prometheus metrics") | ||
cmd.Flags().StringVar(&cpuProfile, "cpu-profile", "", "write cpu profile to file") | ||
|
||
if err := cmd.MarkFlagRequired("changelog"); err != nil { | ||
panic(err) | ||
} | ||
if err := cmd.MarkFlagRequired("db"); err != nil { | ||
panic(err) | ||
} | ||
return cmd | ||
} | ||
|
||
var _ metrics.Proxy = &prometheusMetricsProxy{} | ||
|
||
type prometheusMetricsProxy struct { | ||
workingSize prometheus.Gauge | ||
workingBytes prometheus.Gauge | ||
} | ||
|
||
func newPrometheusMetricsProxy() *prometheusMetricsProxy { | ||
p := &prometheusMetricsProxy{} | ||
p.workingSize = promauto.NewGauge(prometheus.GaugeOpts{ | ||
Name: "iavl_working_size", | ||
Help: "working size", | ||
}) | ||
p.workingBytes = promauto.NewGauge(prometheus.GaugeOpts{ | ||
Name: "iavl_working_bytes", | ||
Help: "working bytes", | ||
}) | ||
http.Handle("/metrics", promhttp.Handler()) | ||
go func() { | ||
err := http.ListenAndServe(":2112", nil) | ||
if err != nil { | ||
panic(err) | ||
} | ||
}() | ||
return p | ||
} | ||
|
||
func (p *prometheusMetricsProxy) IncrCounter(_ float32, _ ...string) { | ||
} | ||
|
||
func (p *prometheusMetricsProxy) SetGauge(val float32, keys ...string) { | ||
// Guard against short key paths; keys[1] is assumed to hold the gauge name. | ||
if len(keys) < 2 { | ||
return | ||
} | ||
k := keys[1] | ||
switch k { | ||
case "working_size": | ||
p.workingSize.Set(float64(val)) | ||
case "working_bytes": | ||
p.workingBytes.Set(float64(val)) | ||
} | ||
} | ||
|
||
func (p *prometheusMetricsProxy) MeasureSince(_ time.Time, _ ...string) {} |
Add proper HTTP server shutdown handling.
The HTTP server is started in a goroutine without proper shutdown handling. This could lead to resource leaks.
Consider implementing graceful shutdown: