Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON-RPC: chainHead_v1_follow Subscription Sends newBlock events with unannounced parentBlockHash #5761

Closed
josepot opened this issue Sep 18, 2024 · 4 comments · Fixed by #5856
Assignees
Labels
I10-unconfirmed Issue might be valid, but it's not yet known.

Comments

@josepot
Copy link

josepot commented Sep 18, 2024

The chainHead_v1_follow subscription is not compliant with the specification because it sometimes sends newBlock events where the parentBlockHash has not been previously announced. According to the specification:

parentBlockHash is guaranteed to be equal either to the current finalized block hash, or to a block reported in an earlier newBlock event.

Issue Details

While testing against the endpoint wss://sys.dotters.network/paseo, we observed that some blocks are missing from the newBlock events sequence. Specifically:

  1. Block #3012893 with hash 0x2b8cbe6887a3e2869d462920d784b604a4683751aea1f2e10f02f97c9333abae is announced.
  2. The next announced block is #3012896 with hash 0x0e95ce16ed5cbf2008bf00c5e1a65736a8cd00c6dcf08da17126d216f79605d9.

The two missing blocks are:

  • Block #3012894 with hash 0x5c7ed91698a589a1b2ec120b6bea82db88ef0ecc4abc7c25d491d9a10b165841
  • Block #3012895 with hash 0x84654e06c8a6d4f0793b377c84cd80d5e802e227636a6b7bc52f1088863447f7

As a result, the parentBlockHash of block #3012896 refers to an unannounced block, which violates the specification's guarantee.

This issue causes the PAPI client to crash.

Logs

I've attached the logs capturing this behavior: pre.log

Steps to Reproduce

  1. Install Bun:

    curl -fsSL https://bun.sh/install | bash
  2. Set up the project:

    mkdir paseo-test
    cd paseo-test
    bun init -y
    bun install polkadot-api
    bun papi add -w wss://sys.dotters.network/paseo pas
  3. Create a file named paseo.ts with the following content:

    import { openSync, writeSync } from "node:fs";
    import { createClient } from "polkadot-api";
    import { getWsProvider } from "polkadot-api/ws-provider/web";
    import { pas } from "@polkadot-api/descriptors";
    import { withLogsRecorder } from "polkadot-api/logs-provider";
    import { fixUnorderedBlocks, parsed } from "polkadot-api/polkadot-sdk-compat";
    
    const pre = openSync("pre-enhancer.log", "w");
    const post = openSync("post-enhancer.log", "w");
    const rpcEnhancer = parsed(fixUnorderedBlocks);
    
    const client = createClient(
      withLogsRecorder(
        (log) => {
          writeSync(post, `${log}\n`);
        },
        rpcEnhancer(
          withLogsRecorder((log) => {
            writeSync(pre, `${log}\n`);
          }, getWsProvider("wss://sys.dotters.network/paseo")),
        ),
      ),
    );
    
    const api = client.getTypedApi(pas);
    
    api.query.System.Account.watchValue(
      "15roJ4ZrgrZam5BQWJgiGHpgp7ShFQBRNLq6qUfiNqXDZjMK",
    ).subscribe(
      (x) => {
        console.log("Account change", x.data);
      },
      console.error,
    );
    
    client.finalizedBlock$.subscribe(console.log, console.error);
  4. Run the script:

    bun run paseo.ts

Impact

This issue disrupts the integrity of the block sequence received via the subscription, making it challenging to process data accurately. It hinders our ability to maintain a consistent view of the blockchain state, as we rely on the guarantee provided by the specification regarding parentBlockHash.

Expected Behavior

Each newBlock event's parentBlockHash should either be:

  • Equal to the current finalized block hash, or
  • A block hash that was reported in an earlier newBlock event.

Actual Behavior

The newBlock event for block #3012896 references a parentBlockHash that was neither finalized nor previously announced via a newBlock event.

Request

Please investigate why the chainHead_v1_follow subscription is emitting newBlock events with parentBlockHash values that have not been previously announced.

cc: @jsdw @lexnv

@josepot
Copy link
Author

josepot commented Sep 19, 2024

I was able to reproduce the same issue against wss://westend-rpc.polkadot.io. The logs:
pre-enhancer.log

@josepot
Copy link
Author

josepot commented Sep 20, 2024

I recently reviewed new logs that provide additional insight into this issue. Specifically, in the logs, when block 0xb3ff4aa8b402e64414611090c6903f0a1859519121f1d512a0590a91568494ac is announced, it references a parentBlockHash (0x552f18a597f1cd37a5fd6c102259714f17786deeb6f672784c23ff73ea775a0d) that had not yet been announced. Later, a fork of 2 blocks that reference 0xb3ff4aa8b402e64414611090c6903f0a1859519121f1d512a0590a91568494ac are announced, but the missing block remains unreported.

It seems that at this point, the server detects that its state is corrupted and triggers a stop event.

This is not an isolated case; I observed a similar pattern yesterday. After the server sent blocks with missing parentBlockHash values, it followed up with a stop event.

One notable point is that after the stop event, the client immediately re-subscribes, and in the next initialized event, the final block in the list of finalizedBlockHashes is the block that was missing from the previous subscription.

EDIT: in those logs you will see other instances of the "stop" event. However, please know that the instances that have an extra property like "eventType": "internal" are events generated by the library due to an abrupt disconnection from the WebSocket connection. We have a middleware that handles these kinds of situations so that the client can keep on working (and it does).

@lexnv
Copy link
Contributor

lexnv commented Sep 20, 2024

Thanks @josepot for the detailed report here! And apologies again for the inconvenience 🙏

Looking briefly at the code, the importing of NewBlocks only checks at the moment if the NewBlock hash was pinned previously. However, it does not check whether the notification.parent_hash is known.

if !self.sub_handle.pin_block(&self.sub_id, notification.hash)? {
return Ok(Default::default())
}

In those cases, we'd need to extend the state corruption detection to take this into account:

B0 -> B1 -> B2
      B1 -> (unknown number of blocks ..) -> B8

In this example, block B8 is reported as NewBlock with parent B7. The issue is that [B2..B7] are not announced yet.

@Polkadot-Forum
Copy link

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/polkadot-api-updates-thread/7685/15

github-merge-queue bot pushed a commit that referenced this issue Oct 7, 2024
There are cases during warp sync or re-orgs, where we receive a
notification with a block parent that was not reported in the past. This
PR extends the tracking state to catch those cases and report a `Stop`
event to the user.

This PR adds a new state to the RPC-v2 chainHead to track which blocks
have been reported.

In the past we relied on the pinning mechanism to provide us details if
a block is pinned or not.
However, the pinning state keeps the minimal information around for
pinning. Therefore, unpinning a block will cause the state to disappear.

Closes: #5761

---------

Signed-off-by: Alexandru Vasile <[email protected]>
Co-authored-by: Sebastian Kunert <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I10-unconfirmed Issue might be valid, but it's not yet known.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants