Side load of fully-signed snapshot #1858
Comments
As discussed, there are two quirks here:
That said, I think there are still two ways/reasons to proceed here:
Regarding this final point, @ffakenz do you want to take a shot at writing up the gist of a test that would do that? And we can at least see if we can get to a failure that this feature could resolve?
@ch1bo @noonio Specifically:
The advantages of this approach are that peers' local ledger states never diverge and peers' mempools are regularly flushed of stale/invalid txs.
I think that Hydra's current approach of peers maintaining local unconfirmed ledger states is a redundant remnant from Cardano L1. On L1, every peer applies every tx to its local ledger state as soon as it is received. This is done because the peer needs to know immediately whether the tx is valid and should be gossiped further, or is invalid and should be suppressed. In Hydra's L2 protocol, every tx is directly broadcast to all peers regardless of its validity, so there's no need to immediately decide whether it is valid.
@GeorgeFlerovsky Good points, and some of them crossed my mind too. The current implementation has not departed from the original paper for fear of accidentally breaking the original consensus protocol. Maybe it's time to be braver and make things better by design, as you suggest. I know where the need for a local ledger view comes from though: in the original design, there was no round-robin leader, but anyone could propose snapshots, as often or as rarely as they wanted. For this, each participant would maintain a current view of the world, which is especially important if snapshots are not done after each tx. In summary, we departed in two ways from this:
I like the mempool way of putting things, as it would hint at specifying the off-chain protocol in a more robust-by-design way of propagating information (i.e. not using a reliable
@ch1bo Yup, makes sense. As a starting point for your protocol writeup, please take a look at how we described the offchain consensus protocol in the Hydrozoa spec (§ 5): We'll likely have a newer version in the next two weeks (incorporating feedback from ~120 comments in the discussion), but it should already give you a clear idea of our protocol. If you or your team have any feedback/questions, please comment in the same discussion.
Theoretically, ReqTx could be sent only to the next K snapshot leaders (K > 1 for robustness) instead of all peers. The snapshot leader who affirms/rejects the transaction would then include the whole transaction in the snapshot, not just the tx hash. However, I think it's more optimal for the transaction submitter to multicast ReqTx to all peers, for these reasons:
Indeed, if we drop the reliable multicast assumption for ReqTx, then any peer that reaches timeout before receiving an L2 transaction mentioned by a snapshot can request the missing L2 transactions from the snapshot leader. However, we'll keep the assumption in Hydrozoa for now, as I'm sure that many other nuances will arise if we drop it.
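To make the "request the missing L2 transactions from the snapshot leader" idea concrete, here is a minimal Python sketch. It is not taken from Hydra or Hydrozoa: the function names, the request shape, and the round-robin rule `snapshot_number % len(parties)` are all assumptions made for illustration.

```python
def leader_for(snapshot_number, parties):
    # Hypothetical round-robin rule; the real protocols may index differently.
    return parties[snapshot_number % len(parties)]


def on_timeout(snapshot_number, snapshot_tx_ids, local_txs, parties):
    """Build a request for the txs a snapshot mentions but we never received.

    snapshot_tx_ids: tx ids listed in the snapshot, in order.
    local_txs: mapping of tx id -> tx body for everything received so far.
    Returns None when nothing is missing.
    """
    missing = [tx_id for tx_id in snapshot_tx_ids if tx_id not in local_txs]
    if not missing:
        return None
    return {
        "to": leader_for(snapshot_number, parties),
        "request": "missing_txs",  # hypothetical message name
        "tx_ids": missing,
    }


parties = ["alice", "bob", "carol"]
request = on_timeout(4, ["a", "b", "c"], {"a": "body-of-a"}, parties)
```

In this sketch the peer only contacts the leader of the snapshot in question, which keeps the recovery path off the normal broadcast path.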
Todo:
Need to find out how to construct these "diverging views" and how to resolve them (pumba sets? Maybe, if any still fail after raft!)
Description
Processing transactions in a Hydra head requires each node to agree on transactions. The protocol will validate transactions (on NewTx command) against its local view of the ledger state, using the passed --ledger-protocol-parameters. As transactions can be valid or invalid based on configuration (or to some extent exact build versions of hydra-node), it is possible that one node accepts a transaction, while the peer nodes do not.
Currently, this means that the node which accepted the transaction now has a different local state than the other nodes and might try to spend outputs that other nodes don't see available. For example, when using hydraw, the node would be using outputs introduced by the previous pixel paint transaction, but other nodes will deem any new transaction invalid with a BadInputs error.
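The divergence can be illustrated with a toy model. The Python sketch below is not hydra-node code: `Node`, `Tx`, the `max_tx_size` field, and the exception names are hypothetical stand-ins for a node's local ledger view and one protocol parameter. It only shows how a single configuration difference leads one node to accept a transaction and the other to later report missing inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    tx_id: str
    inputs: frozenset   # output references consumed
    outputs: frozenset  # output references produced
    size: int


class BadInputs(Exception):
    pass


class TxTooLarge(Exception):
    pass


@dataclass
class Node:
    max_tx_size: int  # stand-in for one ledger protocol parameter
    utxo: set         # this node's local view of unspent outputs

    def apply(self, tx):
        if tx.size > self.max_tx_size:
            raise TxTooLarge(tx.tx_id)
        if tx.inputs - self.utxo:
            raise BadInputs(tx.tx_id)
        self.utxo = (self.utxo - tx.inputs) | tx.outputs


def try_apply(node, tx):
    try:
        node.apply(tx)
        return "ok"
    except (BadInputs, TxTooLarge) as err:
        return type(err).__name__


# Same starting state, different configuration.
alice = Node(max_tx_size=200, utxo={"o0"})
bob = Node(max_tx_size=100, utxo={"o0"})

tx1 = Tx("tx1", frozenset({"o0"}), frozenset({"o1"}), size=150)
tx2 = Tx("tx2", frozenset({"o1"}), frozenset({"o2"}), size=50)

results = [(try_apply(alice, tx), try_apply(bob, tx)) for tx in (tx1, tx2)]
```

Here `tx2` is small enough for both nodes, yet the second node still rejects it because it never applied `tx1` and so does not see the output `tx2` spends.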
Within this feature, we want to improve the UX of hydra-node in the presence of such misalignments.
Note: We should only adopt snapshots that are enforceable on L1.
Suggested solution
Allow adoption of a new snapshot. This snapshot has to have a number and version strictly bigger than the previous one.
Allow introspection of the current snapshot in a particular node
Work out what constraints are required to accept a new snapshot
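As a starting point for working out these constraints, the check could look roughly like the Python sketch below. It is an assumption-laden illustration, not the hydra-node implementation: `Snapshot`, `can_adopt`, and the `signed_by` set (a stand-in for a verified multi-signature) are hypothetical, and "strictly bigger" is read literally as applying to both `number` and `version`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    number: int
    version: int
    signed_by: frozenset  # stand-in for a verified multi-signature


def can_adopt(current, candidate, parties):
    """Return True if `candidate` may replace `current` on this node."""
    # "Fully signed" is modelled as: every head party has signed.
    fully_signed = candidate.signed_by >= parties
    # Literal reading of the requirement; other readings are possible.
    strictly_newer = (
        candidate.number > current.number
        and candidate.version > current.version
    )
    return fully_signed and strictly_newer


parties = frozenset({"alice", "bob", "carol"})
current = Snapshot(number=3, version=1, signed_by=parties)

ok = can_adopt(current, Snapshot(5, 2, parties), parties)
partially_signed = can_adopt(
    current, Snapshot(5, 2, frozenset({"alice", "bob"})), parties
)
stale_number = can_adopt(current, Snapshot(3, 2, parties), parties)
```

Whether the version must also increase, or only the number, is exactly the kind of constraint that still needs to be worked out.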
What
In this setting, every peer has to manually cooperate by posting a command to reset their local state to a previously confirmed snapshot.
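What such a reset does to a single node's state might be sketched as follows. This is a hypothetical Python model, not the hydra-node API: `HeadState`, its fields, and `reset_to_confirmed` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class HeadState:
    confirmed_utxo: frozenset  # UTxO of the last confirmed snapshot
    local_utxo: set            # local view, including unconfirmed txs
    pending_txs: list          # tx ids applied locally but not yet confirmed


def reset_to_confirmed(state):
    """Discard unconfirmed local changes and return the dropped tx ids."""
    dropped = list(state.pending_txs)
    state.local_utxo = set(state.confirmed_utxo)
    state.pending_txs.clear()
    return dropped


state = HeadState(
    confirmed_utxo=frozenset({"o0"}),
    local_utxo={"o2"},
    pending_txs=["tx1", "tx2"],
)
dropped = reset_to_confirmed(state)
```

Returning the dropped tx ids lets the operator see what was discarded and resubmit it if needed. The reset only realigns the head if all peers perform it against the same confirmed snapshot.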
Scenarios
A configuration discrepancy (e.g. in --ledger-protocol-parameters) arises after the Head is open. For example, maxTxSize/maxTxExecutionUnits could accommodate the first NewTx but not the second, putting a peer in disagreement with the others.
A peer goes offline for too long and fails to catch up or to resend AckSn.
Additional context
Compared to #1284, this solution does not depend on how long a peer stays offline.
Here, when a peer comes back online, the networking layer will make sure the reconnecting peer catches up. However, if someone clears their pending txs while this is happening, it creates a worse scenario where parties end up on different confirmed snapshots.