Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add random walk component #2501

Merged
merged 11 commits into from
Apr 26, 2024
Merged

feat: add random walk component #2501

merged 11 commits into from
Apr 26, 2024

Conversation

achingbrain
Copy link
Member

To allow finding network services, add a random walk component that lets services find random network peers in a scalable way.

If two services try to random walk at the same time, they will share the results.

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if necessary (this includes comments as well)
  • I have added tests that prove my fix is effective or that my feature works

To allow finding network services, add a random walk component that
lets services find random network peers in a scalable way.

If two services try to random walk at the same time, they will share
the results.
@achingbrain achingbrain requested a review from a team as a code owner April 24, 2024 12:27
Copy link
Member

@SgtPooki SgtPooki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code/comments could be improved a little bit to make it more clear what's happening in walk and startWalk

Comment on lines 50 to 72
if (!this.walking) {
this.startWalk()
}

this.walkers++

let deferred = pDefer<PeerInfo>()
const onPeer = (event: CustomEvent<PeerInfo>): void => {
deferred.resolve(event.detail)
}

this.addEventListener('walk:peer', onPeer)

try {
while (true) {
this.needNext = pDefer()

const peerInfo = await deferred.promise
deferred = pDefer()

yield peerInfo
this.needNext.resolve()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this flow is slightly confusing and could use an explainer jsdoc comment.

  1. startWalk is called, and iterates over peerRouting.getClosestPeers
    • emits "walk:peer" event
    • races needNext against signal if there's only a single consumer
  2. create parent deferred promise that resolves to emitted "walk:peer" event detail.
  3. listen for "walk:peer" event
  4. loop forever (until abort or other thrown I assume):
    • create deferred promise at needNext
    • await the parent deferred promise & set new parent
    • yield the peerInfo to consumers
    • resolve needNext

This seems overly complicated and a lot of indirection to get a consumer aware async generator?

Can we change startWalk to getPeer where we use something like it-take to consume only a single peer from getClosestPeers when asked?

It seems like startWalk could benefit from rephrasing it as a "peerFaucet" that you turn on/off depending on if there are consumers of the walk generator.

What am I missing...?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loop forever (until abort or other thrown I assume)

Not exactly, the loop will exit if the consumer stops iterating over the returned generator.

Given this generator:

async function * generator () {
  let val = 0
  console.info('start loop')

  try {
    while (true) {
      console.info('in loop')
      yield val++
      console.info('after yield')
    }
  } finally {
    console.info('out of loop')
  }
}

...if I just consume the first value and then break out of the loop:

for await (const val of generator()) {
  console.info('got value', val)
  break
}

The output is:

start loop
in loop
got value 0
out of loop

You'll notice that "after yield" is never logged.

This seems overly complicated and a lot of indirection to get a consumer aware async generator?

It's a multiple-consumer aware generator. That is, if there's one consumer, we'll advance the generator at the speed that matches the consumer. If there are multiple consumers the generator will be advanced at the speed of the fastest consumer.

They may share a value if they pull at the same speed, or one may miss values if it pulls slowly. This is fine because the peers are randomly chosen so the value stream can be non-deterministic.

Can we change startWalk to getPeer where we use something like it-take to consume only a single peer from getClosestPeers when asked?

I'd rather not because this will create a new generator and a new random query for every consumer. This is exactly what I'm trying to avoid here because queries like this are expensive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some more comments, hopefully it's clearer now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also swapped out the deferred promise for racing success/error events against a signal. Now the user can abort before receiving a value, or they can break out of the loop after receiving a value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that helps a lot. thanks so much for the explainers

packages/libp2p/test/core/random-walk.spec.ts Outdated Show resolved Hide resolved
packages/libp2p/test/core/random-walk.spec.ts Outdated Show resolved Hide resolved
Comment on lines 87 to 94
peerRouting.getClosestPeers
.onFirstCall().returns(async function * () {
yield randomPeer1
}())
.onSecondCall().returns(async function * () {
yield randomPeer2
}())
.onThirdCall().returns(slowIterator())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q on stubbing async generators cause I always seem to run into issues here.

Can we just callsFake = () => { yield randomPeer1; yield randomPeer2; } ?

Copy link
Member Author

@achingbrain achingbrain Apr 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ordinarily yes, though more like:

peerRouting.getClosestPeers
  .callsFake(async function * () {
    yield randomPeer1
    yield randomPeer2
  })

..but not in this test because here we want to assert that the query has been run twice, so we stub multiple invocations.

Though I guess we could use callsFake instead of an IIFE.

packages/libp2p/test/core/random-walk.spec.ts Outdated Show resolved Hide resolved
Copy link
Member

@SgtPooki SgtPooki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. thank you for the clarification!

@achingbrain achingbrain merged commit 998fcaf into main Apr 26, 2024
22 checks passed
@achingbrain achingbrain deleted the feat/add-random-walk branch April 26, 2024 16:50
@achingbrain achingbrain mentioned this pull request Apr 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants