Kusama Slash Era 7126 - What went wrong? #5778
Comments
What's also a bit odd is that the slash was reported for era 7126, which was 6 eras (about 36 hours) ago, but the slash report itself was made only about 10 hours ago.
It's also odd that you get a slash while in a sync state, i.e. while not yet actively validating blocks. It's almost as if the finality worker was using old info (from when the node went down, some kind of buffer?) and tried something with that old info against the already finalized chain. Hopefully someone with more expertise in this area can take a look.
Looking at the code it seems we have tried to prevote again on the
Thanks Basti, I'll register a referendum to cancel the slash even if the vote has no on-chain consequence.
@paradox-tt It's not evident from the logs, but how was the node shut down? Was it a clean shutdown? The node persists the block it voted for in order to prevent equivocating in case it's restarted, but if the node isn't shut down properly (e.g. the host machine is powered off), it could be that this isn't properly persisted.
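To make "equivocating" concrete: a voter equivocates when it casts two different votes in the same round, which is what can happen if the record of the first vote is lost across a restart. Below is a minimal sketch of that check. The `Prevote` type, its field names, and `find_equivocations` are hypothetical names for illustration and are not the actual GRANDPA data structures or API.

```python
from collections import defaultdict
from typing import NamedTuple


class Prevote(NamedTuple):
    # Illustrative fields only; not the real GRANDPA message layout.
    round_number: int
    voter: str
    block_hash: str


def find_equivocations(votes):
    """Return {(round_number, voter): set_of_block_hashes} for every voter
    that prevoted for more than one distinct block in the same round."""
    seen = defaultdict(set)
    for vote in votes:
        seen[(vote.round_number, vote.voter)].add(vote.block_hash)
    # Re-sending the same vote is harmless; only distinct targets count.
    return {key: hashes for key, hashes in seen.items() if len(hashes) > 1}


votes = [
    Prevote(10, "validator-a", "0xaaa"),  # sent before the unclean shutdown
    Prevote(10, "validator-b", "0xaaa"),
    Prevote(10, "validator-a", "0xbbb"),  # sent after restart, same round
]
equivocations = find_equivocations(votes)
```

In this toy data only `validator-a` is flagged, because it is the only voter with two different targets in round 10.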
The logs seem to indicate that it wasn't a clean shutdown. If @paradox-tt can confirm this, the vote was probably not persisted.
@andresilva I can confirm the shutdown was NOT clean. Just asking: when the node comes back up and is still syncing, should a vote/prevote still be sent?
On a high level I would say no: we could detect whether we are major syncing and then decide whether the GRANDPA voter does anything based on that. The reason we are not doing that is that finality and block production aren't tied to each other, so you could be 100 blocks behind while syncing and your vote for finality would still be useful (e.g. if finality was also lagging by 100 blocks). It also makes the code simpler, since there are fewer edge cases to think about. And waiting for sync to finish wouldn't necessarily guarantee that we wouldn't equivocate anyway once it finishes. I think the root issue here is that the information about what the validator voted on wasn't persisted. This can happen because of a dirty shutdown, but we can also improve this situation. Currently it seems that there is no way to explicitly sync the database (and thus make sure all data is flushed to disk). If we add support for this, the situation should be resolved, because we would push the vote to the network only after being sure that the vote information is actually persisted on disk. Assuming this exists, if the node crashes we can be sure that either the node didn't vote, or, if it voted and the vote reached the network, the data was definitely persisted on disk, and therefore upon restarting it will not vote for something else. (Off-topic: FWIW I would personally vote in favor of reverting the slash.)
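The "persist, flush to disk, then broadcast" ordering described above can be sketched as follows. This is only an illustration under stated assumptions: a JSON file stands in for the node's database, `persist_vote`, `load_vote`, and `cast_vote` are hypothetical helpers, and none of this reflects how the actual node stores its voter state.

```python
import json
import os
import tempfile


def persist_vote(path, vote):
    """Durably record the vote: write a temp file, fsync it, then rename."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(vote, f)
        f.flush()
        os.fsync(f.fileno())  # force file contents to disk
    os.replace(tmp_path, path)  # atomic rename over any older record
    if os.name == "posix":
        # Also fsync the directory so the rename itself survives a crash.
        dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


def load_vote(path):
    """Return the last persisted vote, or None if there is none."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


def cast_vote(path, round_number, block_hash, broadcast):
    """Persist first, broadcast second; never contradict a persisted vote."""
    previous = load_vote(path)
    if previous is not None and previous["round"] == round_number:
        # Already voted in this round: repeat that vote, never a new target.
        block_hash = previous["block"]
    else:
        persist_vote(path, {"round": round_number, "block": block_hash})
    broadcast(round_number, block_hash)  # only reached after the fsync
    return block_hash


sent = []
vote_file = os.path.join(tempfile.mkdtemp(), "last_vote.json")
first = cast_vote(vote_file, 7, "0xaaa", lambda r, b: sent.append((r, b)))
# Simulated restart: in-memory state is gone and the node now prefers 0xbbb.
second = cast_vote(vote_file, 7, "0xbbb", lambda r, b: sent.append((r, b)))
```

Because the broadcast happens only after the write has been flushed, a crash leaves one of two outcomes: nothing was sent, or the vote that was sent can be read back on restart. In the usage above the second call therefore repeats `0xaaa` instead of voting for `0xbbb`.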
Hey guys,
I had some downtime on one of my Kusama nodes. When it restarted, it received a slash for equivocation, with details below.
The slashed amount is fairly low, but I would still like to have the technicalities looked at to see whether this is a bug or user error. If it's a bug, I would then seek to have the slash revoked; if not, I'll adjust my procedures to prevent a recurrence.