Revert "consensus/bor: fetch validator set using parent hash" #1440
Reverts #1384
The change was made to use state from the previous header instead of the last canonical header, as that was useful for side chain imports. It seems that even that can cause issues.
There were a lot of peer drops recently on mainnet, and this change turned out to be the root cause. Whenever a chain is being imported, headers are sent to the consensus engine for simple verification.
bor/core/blockchain.go
Line 2101 in 51af82c
In the consensus engine, verification happens in the background, as below:
bor/consensus/bor/bor.go
Lines 317 to 327 in 51af82c
Header verification is supposed to be stateless (i.e. it can be done independently, without any information about the previous block's state), and hence this works in most cases. However, we modified this behaviour and used `header.ParentHash` to make an `eth_call` against its state to fetch the validator set and compare it with the header. Because `engine.VerifyHeaders` is called very early in the `InsertChain` function, there is very little chance that the parent header has been processed by then. This leads to the error `header for hash not found`, which propagates all the way to the downloader and drops the peer.
E.g. if we're receiving a chain `[n, n+10]`, we verify headers separately (call it process A) and process blocks one by one (call it process B). A for `n` passes, as `n-1` is canonical and written to disk. We then perform B for `n`. Meanwhile, we also do A for `n+1`, as it runs in parallel. Until B completes for `n`, the eth call for `n+1` will fail, as `n` is not written to disk yet. Note that this can happen for any of the later blocks, not the first one. This will cause the whole import to fail and lead to a peer drop.
For now, we're reverting this to what it was previously, as it covers pretty much all the cases. Will create a separate PR to handle everything post discussion.