Side load snapshot #1864

Open
wants to merge 10 commits into
base: master

Conversation

ffakenz
Contributor

@ffakenz ffakenz commented Feb 25, 2025

Closes #1858

Summary

🐧 introduce new SideLoadSnapshot ClientInput

🐧 introduce new endpoint POST /snapshot

  • calls new ClientInput

🐧 introduce new ServerOutput SnapshotSideLoaded

  • to signal when SideLoadSnapshot has been performed.

🐧 HeadLogic now handles SideLoadSnapshot

  • persists SnapshotSideLoaded event
  • produces SnapshotSideLoaded server output

🐧 HeadLogic handles event SnapshotSideLoaded

  • the new snapshot is adopted as the latest confirmed snapshot.

🐧 introduce new endpoint GET /snapshot

  • returns latest ConfirmedSnapshot

🐧 introduce new endpoint POST /snapshot/latest

  • calls new ClientInput using latest ConfirmedSnapshot
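
The flow listed above can be sketched roughly as follows. This is a minimal toy model, not the code in this PR: the types are simplified, hypothetical stand-ins that reuse the names from the summary, and `handleInput` and `aggregate` are illustrative function names only.

```haskell
module Main where

-- Hypothetical, simplified stand-ins for the types named in the summary.
newtype Snapshot = Snapshot {number :: Int}
  deriving (Eq, Show)

-- Client input carrying the snapshot to side-load.
newtype ClientInput = SideLoadSnapshot Snapshot
  deriving (Eq, Show)

-- Event persisted by the head logic (constructor name is illustrative).
newtype StateChanged = SnapshotSideLoadedEvent Snapshot
  deriving (Eq, Show)

-- Output sent to API clients once the side-load has been performed.
newtype ServerOutput = SnapshotSideLoaded Snapshot
  deriving (Eq, Show)

newtype HeadState = HeadState {confirmedSnapshot :: Snapshot}
  deriving (Eq, Show)

-- Handling the input yields an event to persist and an output to emit.
handleInput :: ClientInput -> (StateChanged, ServerOutput)
handleInput (SideLoadSnapshot sn) =
  (SnapshotSideLoadedEvent sn, SnapshotSideLoaded sn)

-- Applying the persisted event adopts the snapshot as latest confirmed.
aggregate :: HeadState -> StateChanged -> HeadState
aggregate st (SnapshotSideLoadedEvent sn) = st{confirmedSnapshot = sn}

main :: IO ()
main = do
  let st0 = HeadState (Snapshot 1)
      (event, output) = handleInput (SideLoadSnapshot (Snapshot 2))
  print output
  print (aggregate st0 event)
```

Splitting the input handler from the event application mirrors the two HeadLogic bullets above: one step decides what to persist and emit, the other changes the state.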

  • CHANGELOG updated or not needed
  • Documentation updated or not needed
  • Haddocks updated or not needed
  • No new TODOs introduced or explained hereafter

@ffakenz ffakenz self-assigned this Feb 25, 2025
@ffakenz ffakenz requested a review from a team February 25, 2025 16:24

github-actions bot commented Feb 25, 2025

Transaction cost differences

No cost or size differences found


github-actions bot commented Feb 25, 2025

Transaction costs

Sizes and execution budgets for Hydra protocol transactions. Note that unlisted parameters currently use arbitrary values, and results are neither fully deterministic nor fully comparable to previous runs.

Metadata
Generated at 2025-03-12 15:28:03.148796391 UTC
Max. memory units 14000000
Max. CPU units 10000000000
Max. tx size (kB) 16384

Script summary

| Name | Hash | Size (Bytes) |
| --- | --- | --- |
| νInitial | c8a101a5c8ac4816b0dceb59ce31fc2258e387de828f02961d2f2045 | 2652 |
| νCommit | 61458bc2f297fff3cc5df6ac7ab57cefd87763b0b7bd722146a1035c | 685 |
| νHead | 0e35115a2c7c13c68ecd8d74e4987c04d4539e337643be20bb3274bd | 14756 |
| μHead | 57166715eadb8d3135964325c016eea546c21e1c0aae974ca67df9a5* | 5541 |
| νDeposit | ae01dade3a9c346d5c93ae3ce339412b90a0b8f83f94ec6baa24e30c | 1102 |
  • The minting policy hash is only usable for comparison. As the script is parameterized, the actual script is unique per head.

Init transaction costs

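| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 6093 | 11.23 | 3.50 | 0.53 |
| 2 | 6298 | 13.04 | 4.04 | 0.56 |
| 3 | 6493 | 15.68 | 4.86 | 0.60 |
| 5 | 6897 | 20.07 | 6.21 | 0.66 |
| 10 | 7907 | 31.58 | 9.75 | 0.82 |
| 40 | 13933 | 98.39 | 30.22 | 1.78 |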

Commit transaction costs

This uses ada-only outputs for better comparability.

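| UTxO | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 558 | 2.44 | 1.16 | 0.20 |
| 2 | 740 | 3.38 | 1.73 | 0.22 |
| 3 | 920 | 4.36 | 2.33 | 0.24 |
| 5 | 1277 | 6.41 | 3.60 | 0.28 |
| 10 | 2172 | 12.13 | 7.25 | 0.40 |
| 54 | 10067 | 98.61 | 68.52 | 1.88 |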

CollectCom transaction costs

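| Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- |
| 1 | 57 | 525 | 25.64 | 7.39 | 0.43 |
| 2 | 114 | 636 | 33.83 | 9.73 | 0.52 |
| 3 | 170 | 747 | 45.98 | 13.03 | 0.65 |
| 4 | 227 | 858 | 50.59 | 14.55 | 0.70 |
| 5 | 282 | 969 | 62.42 | 17.77 | 0.83 |
| 6 | 336 | 1081 | 71.40 | 20.39 | 0.92 |
| 7 | 393 | 1192 | 82.15 | 23.24 | 1.03 |
| 8 | 450 | 1307 | 97.51 | 27.30 | 1.19 |
| 9 | 504 | 1414 | 98.45 | 28.14 | 1.21 |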

Cost of Increment Transaction

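| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 1829 | 25.42 | 8.32 | 0.50 |
| 2 | 1929 | 27.11 | 9.49 | 0.52 |
| 3 | 2153 | 29.28 | 10.95 | 0.56 |
| 5 | 2321 | 31.66 | 12.92 | 0.60 |
| 10 | 3100 | 42.05 | 19.60 | 0.77 |
| 38 | 7027 | 92.30 | 54.21 | 1.60 |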

Cost of Decrement Transaction

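| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 604 | 23.71 | 7.54 | 0.42 |
| 2 | 761 | 25.27 | 8.65 | 0.45 |
| 3 | 870 | 26.48 | 9.61 | 0.47 |
| 5 | 1204 | 31.20 | 12.27 | 0.54 |
| 10 | 2158 | 45.31 | 19.48 | 0.75 |
| 36 | 5856 | 95.89 | 50.69 | 1.55 |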

Close transaction costs

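| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 640 | 30.95 | 9.68 | 0.50 |
| 2 | 888 | 31.73 | 10.76 | 0.52 |
| 3 | 868 | 34.13 | 11.88 | 0.55 |
| 5 | 1223 | 36.32 | 14.07 | 0.60 |
| 10 | 1869 | 45.32 | 20.15 | 0.74 |
| 34 | 5780 | 98.74 | 53.86 | 1.59 |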

Contest transaction costs

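| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 629 | 35.29 | 10.70 | 0.54 |
| 2 | 815 | 38.15 | 12.32 | 0.58 |
| 3 | 942 | 40.36 | 13.65 | 0.62 |
| 5 | 1312 | 45.61 | 16.72 | 0.70 |
| 10 | 2124 | 58.56 | 24.28 | 0.89 |
| 26 | 4669 | 98.67 | 47.76 | 1.50 |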

Abort transaction costs

There is some variation due to the random mixture of initial and already committed outputs.

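| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 5988 | 28.20 | 9.30 | 0.71 |
| 2 | 6057 | 34.04 | 11.17 | 0.77 |
| 3 | 6277 | 47.84 | 15.82 | 0.93 |
| 4 | 6430 | 58.31 | 19.32 | 1.04 |
| 5 | 6582 | 67.89 | 22.46 | 1.15 |
| 6 | 6875 | 79.90 | 26.68 | 1.29 |
| 7 | 6792 | 82.55 | 27.25 | 1.31 |
| 8 | 6989 | 98.78 | 32.60 | 1.49 |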

FanOut transaction costs

Involves spending head output and burning head tokens. Uses ada-only UTXO for better comparability.

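| Parties | UTxO | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- | --- |
| 10 | 0 | 0 | 6092 | 18.75 | 6.15 | 0.61 |
| 10 | 1 | 57 | 6125 | 21.79 | 7.29 | 0.65 |
| 10 | 5 | 284 | 6260 | 30.77 | 10.75 | 0.75 |
| 10 | 10 | 569 | 6430 | 40.96 | 14.72 | 0.87 |
| 10 | 20 | 1138 | 6769 | 61.80 | 22.83 | 1.11 |
| 10 | 37 | 2106 | 7348 | 98.68 | 37.10 | 1.54 |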

End-to-end benchmark results

This page is intended to collect the latest end-to-end benchmark results produced by Hydra's continuous integration (CI) system from the latest master code.

Please note that these results are approximate as they are currently produced from limited cloud VMs and not controlled hardware. Rather than focusing on the absolute results, the emphasis should be on relative results, such as how the timings for a scenario evolve as the code changes.

Generated at 2025-03-12 15:30:51.611782551 UTC

Baseline Scenario

Number of nodes 1
Number of txs 300
Avg. Confirmation Time (ms) 4.682642310
P99 10.968852049999995ms
P95 5.831514000000002ms
P50 4.269201ms
Number of Invalid txs 0

Memory data

| Time | Used | Free |
| --- | --- | --- |
| 2025-03-12 15:29:35.262780467 UTC | 947M | 6124M |
| 2025-03-12 15:29:40.262753223 UTC | 1041M | 5994M |
| 2025-03-12 15:29:45.262754438 UTC | 1041M | 5993M |
| 2025-03-12 15:29:50.262733645 UTC | 1041M | 5993M |
| 2025-03-12 15:29:55.262680312 UTC | 1041M | 5992M |
| 2025-03-12 15:30:00.262736546 UTC | 1043M | 5990M |

Three local nodes

Number of nodes 3
Number of txs 900
Avg. Confirmation Time (ms) 30.877023852
P99 70.33999927ms
P95 52.579133299999995ms
P50 27.640528500000002ms
Number of Invalid txs 0

Memory data

| Time | Used | Free |
| --- | --- | --- |
| 2025-03-12 15:30:13.607470014 UTC | 958M | 6085M |
| 2025-03-12 15:30:18.607759797 UTC | 1225M | 5817M |
| 2025-03-12 15:30:23.610478761 UTC | 1280M | 5703M |
| 2025-03-12 15:30:28.607673596 UTC | 1287M | 5642M |
| 2025-03-12 15:30:33.607478733 UTC | 1291M | 5638M |
| 2025-03-12 15:30:38.607503176 UTC | 1295M | 5633M |
| 2025-03-12 15:30:43.607484695 UTC | 1298M | 5629M |
| 2025-03-12 15:30:48.607903124 UTC | 1299M | 5627M |

@ffakenz ffakenz force-pushed the side-load-snapshot branch 2 times, most recently from 5a5e550 to 11fb205 Compare February 27, 2025 20:44
@ffakenz ffakenz force-pushed the side-load-snapshot branch 8 times, most recently from 5c79937 to d3c1bc9 Compare March 6, 2025 10:41
@noonio noonio linked an issue Mar 6, 2025 that may be closed by this pull request
@ffakenz ffakenz force-pushed the side-load-snapshot branch 7 times, most recently from 24e8e1d to 858ebb6 Compare March 9, 2025 21:16
@ffakenz ffakenz marked this pull request as ready for review March 9, 2025 21:40
@ffakenz ffakenz force-pushed the side-load-snapshot branch 7 times, most recently from a65a286 to 9142413 Compare March 12, 2025 14:51
@ffakenz ffakenz force-pushed the side-load-snapshot branch 6 times, most recently from 3d3e975 to 0f21cd7 Compare March 12, 2025 15:23
Contributor

@v0d1ch v0d1ch left a comment

Nice work @ffakenz! Perhaps we should also have a mob review session, since it feels like it would be beneficial to go through more possible scenarios together.


As a result of this divergence, the local ledger state of each Hydra node essentially becomes forked, resulting in inconsistent states across the network. While it is technically still possible to submit transactions to the Hydra nodes, doing so is ineffective because snapshots do not update unless all nodes sign them. Each node starts accepting transactions based on entirely different states, leading to disagreement on which UTxOs have been spent.

To recover from this issue, we introduced side-loading of snapshots to synchronize the local ledger state of the Hydra nodes. With this mechanism, every peer reverts to the latest confirmed snapshot. This can be done using the POST /snapshot/latest endpoint, which clears the peer’s local pending transactions and restores its state to match the last agreed snapshot, allowing the node to rejoin the consensus with the rest of the network.
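
The recovery described above can be illustrated with a small pure model. This is only a sketch under simplifying assumptions: `NodeState`, `sideLoadLatest`, and `inSync` are hypothetical names, and UTxO sets and transactions are reduced to lists of strings rather than the real ledger types.

```haskell
module Main where

import Data.List (nub)

-- Hypothetical, simplified node-local view of a head.
data NodeState = NodeState
  { confirmedUTxO :: [String] -- UTxO of the last confirmed snapshot
  , localUTxO :: [String] -- local ledger state, which may have diverged
  , pendingTxs :: [String] -- local transactions not yet in a snapshot
  }
  deriving (Eq, Show)

-- Side-loading the latest confirmed snapshot: drop the pending
-- transactions and reset the local ledger to the snapshot's UTxO.
sideLoadLatest :: NodeState -> NodeState
sideLoadLatest st = st{localUTxO = confirmedUTxO st, pendingTxs = []}

-- Nodes agree when they all hold the same local ledger state.
inSync :: [NodeState] -> Bool
inSync nodes = length (nub (map localUTxO nodes)) <= 1

main :: IO ()
main = do
  let confirmed = ["a", "b"]
      forked =
        [ NodeState confirmed ["a", "c"] ["tx1"]
        , NodeState confirmed ["b", "d"] ["tx2"]
        , NodeState confirmed ["a", "b"] []
        ]
  print (inSync forked)
  print (inSync (map sideLoadLatest forked))
```

In this model the nodes only converge because they share the same confirmed snapshot, which is why the operation has to be applied on every peer.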
Contributor

In the changelog you mention POST /snapshot, but the route here is different: POST /snapshot/latest.

Contributor

I see... there are two routes actually.

send n3 $ input "NewTx" ["transaction" .= tx]

-- Everyone confirms it
-- Note: We can't use `waitForAlMatch` here as it expects them to
Contributor

Suggested change
- -- Note: We can't use `waitForAlMatch` here as it expects them to
+ -- Note: We can't use `waitForAllMatch` here as it expects them to

operationId: sideLoadLatestConfirmedSnapshotRequest
message:
  payload:
    type: "null"
Contributor

Do we need message here if payload is null?

@@ -936,8 +936,7 @@ onOpenChainDepositTx headId env st deposited depositTxId deadline =
waitOnUnresolvedDecommit $
newState CommitRecorded{headId, pendingDeposits = Map.singleton depositTxId deposited, newLocalUTxO = localUTxO <> deposited, utxoToCommit = deposited, pendingDeposit = depositTxId, deadline}
<> if not snapshotInFlight && isLeader parameters party nextSn
- then
- cause (NetworkEffect $ ReqSn version nextSn (txId <$> localTxs) Nothing (Just deposited))
+ then cause (NetworkEffect $ ReqSn version nextSn (txId <$> localTxs) Nothing (Just deposited))
Contributor

This line can be reverted

@@ -1248,6 +1333,14 @@ update env ledger st ev = case (st, ev) of
onOpenChainCloseTx openState newChainState closedSnapshotNumber contestationDeadline
| otherwise ->
Error NotOurHead{ourHeadId, otherHeadId = headId}
(Open openState@OpenState{headId = ourHeadId}, ClientInput (SideLoadSnapshot confirmedSnapshot)) ->
let otherHeadId =
Contributor

You could use getSnapshot function here and then just extract the headId from the Snapshot
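
A rough sketch of what that could look like, using simplified stand-in types rather than the real ones: the `getSnapshot` below is a toy version that assumes both constructors map to a plain `Snapshot`, and `checkHead` and `snapshotHeadId` are hypothetical names introduced only for illustration.

```haskell
module Main where

-- Hypothetical, simplified stand-ins; signatures and UTxO are omitted.
type HeadId = String

data Snapshot = Snapshot
  { snapshotHeadId :: HeadId
  , number :: Int
  }
  deriving (Eq, Show)

data ConfirmedSnapshot
  = InitialSnapshot HeadId
  | ConfirmedSnapshot Snapshot
  deriving (Eq, Show)

-- Assumed behaviour: both constructors map to a plain Snapshot,
-- the initial one being snapshot number 0.
getSnapshot :: ConfirmedSnapshot -> Snapshot
getSnapshot (InitialSnapshot hid) = Snapshot hid 0
getSnapshot (ConfirmedSnapshot sn) = sn

-- Reject a side-loaded snapshot that belongs to another head,
-- without pattern matching on the two constructors separately.
checkHead :: HeadId -> ConfirmedSnapshot -> Either String Snapshot
checkHead ourHeadId cs
  | snapshotHeadId sn == ourHeadId = Right sn
  | otherwise = Left ("NotOurHead: " <> snapshotHeadId sn)
 where
  sn = getSnapshot cs

main :: IO ()
main = do
  print (checkHead "head-1" (ConfirmedSnapshot (Snapshot "head-1" 3)))
  print (checkHead "head-1" (InitialSnapshot "head-2"))
```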

case sideLoadConfirmedSnapshot of
InitialSnapshot{} ->
if currentConfirmedSnapshot == sideLoadConfirmedSnapshot
then -- Spec: ̅S ← snObj(v̂, ŝ, Û, T̂, 𝑈𝛼, 𝑈𝜔)
Contributor

Do these additions require spec changes? If so are they already done? I am not sure if you should have spec comments here unless this piece of code has relevant spec parts.

@@ -94,6 +94,7 @@ monitor transactionsMap metricsMap = \case
-- transactions after some timeout expires
atomically $ modifyTVar' transactionsMap (Map.insert (txId tx) t)
tick "hydra_head_requested_tx"
-- REVIEW! should we handle SnapshotSideLoaded ???
Contributor

I think it would be a nice addition since it is as important as SnapshotConfirmed imo

let tx1 = SimpleTx 1 mempty (utxoRef 2) -- No inputs, requires no specific starting state
tx2 = SimpleTx 2 (utxoRef 2) (utxoRef 3)
tx3 = SimpleTx 3 (utxoRef 3) (utxoRef 4)
snapshot1 = Snapshot testHeadId 0 1 [tx1] (utxoRef 2) Nothing Nothing
Contributor

You could also test how utxoToCommit and utxoToDecommit behave, related to your review comment in the head logic.

@ffakenz ffakenz force-pushed the side-load-snapshot branch from 0f21cd7 to 9f6a7ea Compare March 13, 2025 10:55
Projects
Status: In review 👀
Development

Successfully merging this pull request may close these issues.

Side load of fully-signed snapshot
Heads stuck in a state without being able to progress snapshots
2 participants