Add EventLoop APIs for simpler scheduling of callbacks #2759

simonjbeaumont · 2024-06-28T14:22:24Z

Motivation

The current scheduleTask APIs make use of both callbacks and promises, which leads to confusing semantics. For example, on cancellation, users are notified in two ways: once via the promise and once via the callback. Additionally, the way the API is structured forces allocations (for the closures and the promise) that a differently structured API could avoid.

Modifications

This PR introduces new protocol requirements on EventLoop:

protocol EventLoop {
    // ...
    @discardableResult
    func scheduleCallback(at deadline: NIODeadline, handler: some NIOScheduledCallbackHandler) throws -> NIOScheduledCallback

    @discardableResult
    func scheduleCallback(in amount: TimeAmount, handler: some NIOScheduledCallbackHandler) throws -> NIOScheduledCallback

    func cancelScheduledCallback(_ scheduledCallback: NIOScheduledCallback)
}

Default implementations have been provided that call through to EventLoop.scheduleTask(in:_:) to not break existing EventLoop implementations, although this implementation will be (at least) as slow as using scheduleTask(in:_:) directly.

The API is structured so that EventLoop implementations can provide a custom implementation as an optimization point. This PR provides one for SelectableEventLoop, so that MultiThreadedEventLoopGroup can benefit from a faster implementation.

Finally, this PR adds benchmarks to measure the performance of setting a simple timer using both scheduleTask(in:_:) and scheduleCallback(in:_:) APIs using a MultiThreadedEventLoopGroup.
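As a rough illustration of the default-implementation arrangement described above, here is a minimal, standard-library-only sketch. Every name in it (ToyEventLoop, ToyHandler, LegacyLoop, OptimizedLoop, Trace, Ping) is hypothetical and stands in for the real SwiftNIO types; deadlines and cancellation are left out, and the "loops" run the work immediately.

```swift
// Toy model only: none of these types are SwiftNIO types.

protocol ToyHandler {
    func handleScheduledCallback()
}

protocol ToyEventLoop {
    /// The pre-existing, closure-based requirement.
    func scheduleTask(_ task: @escaping () -> Void)
    /// The new requirement; a default is supplied in the extension below.
    func scheduleCallback(handler: some ToyHandler)
}

extension ToyEventLoop {
    /// Default: wrap the handler in a closure and call through to `scheduleTask`,
    /// so existing conformers keep compiling (at `scheduleTask`'s cost).
    func scheduleCallback(handler: some ToyHandler) {
        self.scheduleTask { handler.handleScheduledCallback() }
    }
}

/// Records which code paths ran, so the demo can show the dispatch.
final class Trace {
    var steps: [String] = []
}

/// An existing loop that knows nothing about the new requirement.
struct LegacyLoop: ToyEventLoop {
    let trace: Trace
    func scheduleTask(_ task: @escaping () -> Void) {
        trace.steps.append("scheduleTask")
        task()  // a real loop would run this later, at the deadline
    }
}

/// A loop that uses the optimization point by providing its own implementation.
struct OptimizedLoop: ToyEventLoop {
    let trace: Trace
    func scheduleTask(_ task: @escaping () -> Void) {
        trace.steps.append("scheduleTask")
        task()
    }
    func scheduleCallback(handler: some ToyHandler) {
        trace.steps.append("custom scheduleCallback")
        handler.handleScheduledCallback()
    }
}

struct Ping: ToyHandler {
    let trace: Trace
    func handleScheduledCallback() { trace.steps.append("handler fired") }
}

/// Schedules one callback on a freshly made loop and returns the recorded steps.
func traceOfScheduling<Loop: ToyEventLoop>(_ makeLoop: (Trace) -> Loop) -> [String] {
    let trace = Trace()
    makeLoop(trace).scheduleCallback(handler: Ping(trace: trace))
    return trace.steps
}

print(traceOfScheduling({ LegacyLoop(trace: $0) }))     // goes through scheduleTask
print(traceOfScheduling({ OptimizedLoop(trace: $0) }))  // bypasses scheduleTask
```

Because scheduleCallback is a protocol requirement rather than only an extension method, generic callers reach the custom implementation whenever a conformer provides one.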

Result

A simpler and more coherent API surface.

There is also a small performance benefit for heavy users of this API, e.g. protocols that make extensive use of timers: when using MTELG to repeatedly set a timer with the same handler, switching from scheduleTask(in:_:) to scheduleCallback(in:_:) removes almost all allocations (amortizing to zero) and is roughly twice as fast.

MTELG.scheduleCallback(in:_:)
╒═══════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═══════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) *      │       0 │       0 │       0 │       0 │       0 │       0 │       0 │    1109 │
╘═══════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

MTELG.scheduleTask(in:_:)
╒═══════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═══════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) *      │       4 │       4 │       4 │       4 │       4 │       4 │       4 │     576 │
╘═══════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛


simonjbeaumont · 2024-07-01T10:57:37Z

@swift-server-bot test this please


Lukasa · 2024-07-04T13:19:31Z

Sources/NIOCore/EventLoop.swift

+
+    /// Set a timer that will call a handler at the given time.
+    @discardableResult
+    func setTimer(for deadline: NIODeadline, _ handler: any NIOTimerHandler) -> NIOTimer


Let's add another bikeshed: is there any reason not to make this method generic?

Done in 06a4ce7. Although I didn't see much improvement from it.


Lukasa · 2024-07-04T13:24:06Z

Sources/NIOCore/EventLoop.swift

+
+    /// Set a timer that will call a handler after a given amount of time.
+    @discardableResult
+    func setTimer(for duration: TimeAmount, _ handler: any NIOTimerHandler) -> NIOTimer


While I'm here, can I suggest that we should use different labels instead of for in both? That will help with tab-completion a bit.

Hm, it felt OK to me. Would you prefer the following?

setTimer(for: TimeAmount, ...)

setTimer(forDeadline: NIODeadline, ...)

I'm thinking of how I set timers verbally on my phone: I always say "set a timer for", regardless of whether I then give a duration or an absolute time.

Open to suggestions here, though :)

setTimer(for:) and setTimer(at:) is probably the easiest spelling distinction.

https://github.com/apple/swift-nio/pull/2759/files/df3cdb1c1f1f3ce8fb66fd81cdfa0cef3318c984#r1664313726 resulted in new API spellings. But these do use different prepositions for the NIODeadline and TimeAmount variants: scheduleCallback(at:handler:) and scheduleCallback(in:handler:), respectively.


This reverts commit 0d9a6f9d6bb4add42e02127daddd01e00d0e6b6d.


simonjbeaumont

Thanks for the feedback so far folks. I've addressed it all and attempted to do so in a series of targeted commits that can be squashed on merge.

//cc @Lukasa @glbrntt @FranzBusch

simonjbeaumont · 2024-07-08T15:38:11Z

Sources/NIOCore/EventLoop.swift

+
+    /// Set a timer that will call a handler after a given amount of time.
+    @discardableResult
+    func setTimer(for duration: TimeAmount, _ handler: any NIOTimerHandler) -> NIOTimer


https://github.com/apple/swift-nio/pull/2759/files/df3cdb1c1f1f3ce8fb66fd81cdfa0cef3318c984#r1664313726 resulted in new API spellings. But these do use different prepositions for the NIODeadline and TimeAmount variants: scheduleCallback(at:handler:) and scheduleCallback(in:handler:), respectively.


simonjbeaumont · 2024-07-08T15:42:28Z

Sources/NIOCore/EventLoop.swift

+
+    /// Set a timer that will call a handler at the given time.
+    @discardableResult
+    func setTimer(for deadline: NIODeadline, _ handler: any NIOTimerHandler) -> NIOTimer


Done in 06a4ce7. Although I didn't see much improvement from it.


glbrntt

Not sure about the method name for NIOScheduledCallbackHandler but beyond that this looks good aside from some nits.


glbrntt · 2024-07-08T15:57:31Z

Sources/NIOCore/NIOScheduledCallback.swift

+    /// This function is called at the scheduled time, unless the scheduled callback is cancelled.
+    ///
+    /// - Parameter eventLoop: The event loop on which the callback was scheduled.
+    func onSchedule(eventLoop: some EventLoop)


The naming reads (to me) like this function is called at the time when the callback is scheduled (when eventLoop.scheduleCallback is called), as opposed to the time when the callback is scheduled to run.

handleScheduledCallback?

I've updated the name to handleScheduledCallback in 2ec5543.

Question: do we need to be more considerate about name clashes since we're expecting folks to conform their types to this protocol; e.g. should this be something like handleNIOScheduledCallback?

This case isn't quite covered by https://github.com/apple/swift-nio/blob/main/docs/public-api.md, but it seems similar in nature.

WDYT?

@glbrntt did you have any thoughts regarding the namespacing or is it fine the way it is?

The function takes a NIO type, so at worst it could cause an overload? Maybe it's OK the way it is now.


simonjbeaumont

Thanks @glbrntt for your latest round of review. I have addressed most of the feedback. I wanted some additional thoughts on these two though:

Tests/NIOPosixTests/NIOScheduledCallbackTests.swift

Sources/NIOPosix/SelectableEventLoop.swift

Sources/NIOCore/EventLoop.swift

Sources/NIOPosix/SelectableEventLoop.swift

simonjbeaumont · 2024-07-10T08:10:26Z

Sources/NIOCore/NIOScheduledCallback.swift

+    /// This function is called at the scheduled time, unless the scheduled callback is cancelled.
+    ///
+    /// - Parameter eventLoop: The event loop on which the callback was scheduled.
+    func onSchedule(eventLoop: some EventLoop)


I've updated the name to handleScheduledCallback in 2ec5543.

Question: do we need to be more considerate about name clashes since we're expecting folks to conform their types to this protocol; e.g. should this be something like handleNIOScheduledCallback?

This case isn't quite covered by https://github.com/apple/swift-nio/blob/main/docs/public-api.md, but it seems similar in nature.

WDYT?

glbrntt

The try! needs fixing but otherwise I'm happy with this!


simonjbeaumont · 2024-07-25T15:20:55Z

OK, cancellation support added. FYI @glbrntt @Lukasa @FranzBusch

weissi · 2024-07-26T06:22:52Z

Motivation

Scheduling a timer is currently an expensive operation because this is done using EventLoop.scheduleTask(in:_:), which results in several allocations. A simple timer use case does not typically need all the functionality of this API and use cases that set many timers (e.g. for timeouts) pay this cost repeatedly.

@simonjbeaumont Could you add information here on why EventLoop.scheduleTask results in more allocations than this? It shouldn't, maybe that's fixable? EventLoop.scheduleTask is meant to kinda be the simplest timer API but the PR contrasts it with "simple timers", which suggests that I'm missing something here.

Also how many allocations are we talking? And do you have something real-worldish where that difference is observable?

simonjbeaumont · 2024-07-26T08:00:57Z

Motivation

Scheduling a timer is currently an expensive operation because this is done using EventLoop.scheduleTask(in:_:), which results in several allocations. A simple timer use case does not typically need all the functionality of this API and use cases that set many timers (e.g. for timeouts) pay this cost repeatedly.

@simonjbeaumont Could you add information here on why EventLoop.scheduleTask results in more allocations than this? It shouldn't, maybe that's fixable? EventLoop.scheduleTask is meant to kinda be the simplest timer API but the PR contrasts it with "simple timers", which suggests that I'm missing something here.

Great questions, and sorry if the write up in the PR description wasn't detailed enough.

IIUC, scheduleTask will allocate a bunch because it takes closures and needs promises. For cases where you just want a callback to an object you own, e.g. for setting repeated timers, this can be wasteful.

Also how many allocations are we talking?

I have added benchmarks and their results to the PR description. We're talking 4 allocations for using scheduleTask and none for scheduleCallback when using SelectableEventLoop.

Note that the default implementation for event loops of scheduleCallback is backed by scheduleTask so there won't be a win here, but it provides an opportunity for event loops to provide a custom implementation, which this PR has for SelectableEventLoop.

And do you have something real-worldish where that difference is observable?

A real-world use case would be something like QUIC, where IIUC it makes use of lots of timers in the protocol. I also think that grpc-swift is likely to benefit from this.

One thing this discussion has prompted me to do is to benchmark repeated scheduleCallback vs using scheduleRepeatedTask to see if there is some amortised benefit to the latter that I've overlooked.

FranzBusch · 2024-07-26T08:04:54Z

@simonjbeaumont Could you add information here on why EventLoop.scheduleTask results in more allocations than this? It shouldn't, maybe that's fixable? EventLoop.scheduleTask is meant to kinda be the simplest timer API but the PR contrasts it with "simple timers", which suggests that I'm missing something here.

FWIW, I tried to make EL.scheduleTask as cheap as possible in the past already and tried to minimise allocations. IIRC I hit a point where I couldn't remove any more of the allocations because that would have had public API impact. One of them is, for example, the future that we allocate, which we can completely avoid with this new API.

Lukasa · 2024-07-26T10:50:00Z

Candidly, there's almost no way to make scheduleTask fast in the current API. scheduleTask is required to return a Scheduled<T>, which looks like this:

swift-nio/Sources/NIOCore/EventLoop.swift

Lines 26 to 53 in 28f9cae

    
public struct Scheduled<T> {
    @usableFromInline typealias CancelationCallback = @Sendable () -> Void
    @usableFromInline let _promise: EventLoopPromise<T>
    @usableFromInline let _cancellationTask: CancelationCallback

    @inlinable
    @preconcurrency
    public init(promise: EventLoopPromise<T>, cancellationTask: @escaping @Sendable () -> Void) {
        self._promise = promise
        self._cancellationTask = cancellationTask
    }

    /// Try to cancel the execution of the scheduled task.
    ///
    /// Whether this is successful depends on whether the execution of the task already begun.
    ///  This means that cancellation is not guaranteed.
    @inlinable
    public func cancel() {
        self._promise.fail(EventLoopError._cancelled)
        self._cancellationTask()
    }

    /// Returns the `EventLoopFuture` which will be notified once the execution of the scheduled task completes.
    @inlinable
    public var futureResult: EventLoopFuture<T> {
        self._promise.futureResult
    }
}

This has the minimum cost of 1 promise, which is 1 allocation. However, in SelectableEventLoop it requires two allocations, as our cancellationTask closes over self and a taskId, so we have to heap-allocate the closure context:

swift-nio/Sources/NIOPosix/SelectableEventLoop.swift

Lines 322 to 331 in 28f9cae

    
cancellationTask: {
    self._tasksLock.withLock { () -> Void in
        self._scheduledTasks.removeFirst(where: { $0.id == taskId })
    }
    // We don't need to wake up the selector here, the scheduled task will never be picked up. Waking up the
    // selector would mean that we may be able to recalculate the shutdown to a later date. The cost of not
    // doing the recalculation is one potentially unnecessary wakeup which is exactly what we're
    // saving here. So in the worst case, we didn't do a performance optimisation, in the best case, we saved
    // one wakeup.
}

But it's worse! The internal data structure that SelectableEventLoop uses to store the scheduled task is a ScheduledTask (natch). That looks like this:

swift-nio/Sources/NIOPosix/MultiThreadedEventLoopGroup.swift

Lines 510 to 534 in 28f9cae

    
internal struct ScheduledTask {
    /// The id of the scheduled task.
    ///
    /// - Important: This id has two purposes. First, it is used to give this struct an identity so that we can implement ``Equatable``
    ///     Second, it is used to give the tasks an order which we use to execute them.
    ///     This means, the ids need to be unique for a given ``SelectableEventLoop`` and they need to be in ascending order.
    @usableFromInline
    let id: UInt64
    let task: () -> Void
    private let failFn: (Error) -> Void
    @usableFromInline
    internal let readyTime: NIODeadline

    @usableFromInline
    init(id: UInt64, _ task: @escaping () -> Void, _ failFn: @escaping (Error) -> Void, _ time: NIODeadline) {
        self.id = id
        self.task = task
        self.failFn = failFn
        self.readyTime = time
    }

    func fail(_ error: Error) {
        failFn(error)
    }
}

Note that we have two more closures. How many of them heap allocate? At least one:

swift-nio/Sources/NIOPosix/SelectableEventLoop.swift

Lines 306 to 312 in 28f9cae

    
{
    do {
        promise.succeed(try task())
    } catch let err {
        promise.fail(err)
    }
},

As all closures over closures must allocate.

All this scaffolding means that this timer cannot help but cause 3 allocations. None of these can go away without losing the ability to handle cancellation, or breaking the existing API surface: the promise is clearly necessary, the cancel function is API and can't be removed (though we could do a deprecation cycle to remove it, probably), and the mere existence of the promise in alloc 1 forces alloc 3.

Additionally, the nature of .scheduleTask as an API encourages you to take another allocation, because it takes a closure. With careful design you can make that go away (by closing only over a class reference), but most users won't notice this.

weissi · 2024-07-26T11:53:11Z

Candidly, there's almost no way to make scheduleTask fast in the current API.

Would it be worth thinking about improving the API with something that is very similar but doesn't suffer all these issues?

I have added benchmarks and their results to the PR description. We're talking 4 allocations for using scheduleTask and none for scheduleCallback when using SelectableEventLoop.

Right, in a naive benchmark, 4 mallocs & 4 frees of allocations that are small and don't last very long is 59 nanoseconds on my machine. Please don't interpret that as saving allocations is not worthwhile, quite the opposite! But inventing duplicate APIs should be well motivated in my opinion.

// run as swiftc -O test.swift && time ./test
import Darwin

@inline(never)
@_optimize(none)
func blackhole(
    _ a: UnsafeMutableRawPointer,
    _ b: UnsafeMutableRawPointer,
    _ c: UnsafeMutableRawPointer,
    _ d: UnsafeMutableRawPointer
) {
    free(a)
    free(b)
    free(c)
    free(d)
}

for _ in 0..<1_000_000_000 {
    guard let a = malloc(24),
    let b = malloc(32),
    let c = malloc(48),
    let d = malloc(48) else {
        preconditionFailure()
    }

    blackhole(a, b, c, d)
}
// $ swiftc -O test.swift && time ./test
//
// real	0m59.349s
// user	0m58.970s
// sys	0m0.360s

A real-world use case would be something like QUIC, where IIUC it makes use of lots of timers in the protocol. I also think that grpc-swift is likely to benefit from this.

I meant a real world use case where the ScheduledTask overhead is large enough that it's actually observable. I understand that QUIC, gRPC and many others use a lot of scheduled tasks, but 59 ns is very very little. Add an order of magnitude, let's call it 500 ns, that's still very little. So we need to be confident that any processing we would do in the timers doesn't totally dwarf the allocation time.
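For reference, the per-iteration figure follows directly from the timing output quoted above. The sketch below only redoes that arithmetic; the variable names are illustrative, and the 500 ns value is the rounded-up budget proposed in the comment, not a measurement.

```swift
// Inputs copied from the micro-benchmark and its `time` output.
let iterations = 1_000_000_000.0    // loop count in the micro-benchmark
let wallClockSeconds = 59.349       // "real 0m59.349s"
let pairsPerIteration = 4.0         // four malloc/free pairs per iteration
let nanosPerSecond = 1_000_000_000.0

// Cost of one iteration (4 mallocs + 4 frees), in nanoseconds: about 59 ns.
let nanosPerIteration = wallClockSeconds * (nanosPerSecond / iterations)

// Average cost of a single malloc/free pair: a little under 15 ns.
let nanosPerPair = nanosPerIteration / pairsPerIteration

// The "add an order of magnitude" budget used in the discussion.
let paddedNanosPerIteration = 500.0

print("per iteration: \(nanosPerIteration) ns")
print("per malloc/free pair: \(nanosPerPair) ns")
print("padding factor: \(paddedNanosPerIteration / nanosPerIteration)x")
```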

Lukasa · 2024-07-26T12:06:09Z

Why are we adding only one order of magnitude? Each of these timers is kept per-connection, so the correct order of multiplicand is "number of connections / number of cores".

This API can be made very similar to scheduleTask, by having it take a closure instead. In that case, it's an almost perfect replacement, except that it drops the problematic promise. This does open the user up to the sharp edge of the closure context, but expert users can avoid that sharp edge.

But I'd argue this makes your objection worse, not better. That API is much closer to the existing one than the current proposal, which increases confusion instead of decreasing it. We cannot deprecate the existing API, because it's a protocol requirement: event loops must implement it, and indeed it must be used as the fallback implementation for the new API.

Lukasa · 2024-07-26T12:07:00Z

As a related sidebar, these timers are typically set on the event loop, so they are a source of latency across the loop as a whole: we can't do I/O for any time where we're allocating memory for these timers.

weissi · 2024-07-26T12:29:06Z

Why are we adding only one order of magnitude? Each of these timers is kept per-connection, so the correct order of multiplicand is "number of connections / number of cores".

I meant adding an order of magnitude for each. In a best-case scenario 4 allocs & frees are 59ns (all four, not each one). I'm suggesting we assume the 4 allocs & 4 frees come in at 500ns (again, the sum of 4 allocs & 4 frees).

This API can be made very similar to scheduleTask, by having it take a closure instead. In that case, it's an almost perfect replacement, except that it drops the problematic promise. This does open the user up to the sharp edge of the closure context, but expert users can avoid that sharp edge.

Yeah, that's what I mean, why not add an API that's the same as the current one except for no ScheduledTask return (and other niche things).

But I'd argue this makes your objection worse, not better. That API is much closer to the existing one than the current proposal, which increases confusion instead of decreasing it. We cannot deprecate the existing API, because it's a protocol requirement: event loops must implement it, and indeed it must be used as the fallback implementation for the new API.

Just so I get it right, your argument is that having a new scheduleTaskFireAndForget(in: TimeAmount, body: @escaping () -> Void) -> Void (name tbd) is confusing? I can sympathise with that but I think we can find a resolution.

My thinking was/is: Let's start with a use case that makes an overhead of 50 to 500ns per schedule actually observable, then let's see what that use case needs. Maybe it needs cancellation support, maybe it doesn't, who knows. It makes it easier to design a new API if we actually know precisely what's necessary.

For example: I mostly do need the returned ScheduledTask such that I'm able to clean up. And I also need the .futureResult usually such that I can schedule the next time (if not expressible by repeated schedule task).

The protocol requirement isn't an issue IMHO. The current API is quite expressive and supports a lot of use cases. But I won't deny that there are use cases where we don't need all features. (I have had those too)

As a related sidebar, these timers are typically set on the event loop, so they are a source of latency across the loop as a whole: we can't do I/O for any time where we're allocating memory for these timers.

Completely. But there's much more overhead than just allocations. Literally the only thing I'm suggesting is a motivation that shows this as an actual win. I know it's a theoretical win but just saving the allocations might be small enough that it's not observable.

And finally: Something that takes 500ns (which is the 10x'd cost of our allocations) can be done 2M times per second per core (but that'd fully consume it). So that's a lot.

Lukasa · 2024-07-26T13:36:15Z

I'm struggling a bit with this feedback, because when I read it it seems like you're asking for two separate things, without clarifying whether they're an AND or an OR.

Yeah, that's what I mean, why not add an API that's the same as the current one except for no ScheduledTask return (and other niche things).

The API is almost identical in function to the current one. It's not fire-and-forget. It's not even that we don't return a ScheduledTask: this API supports cancellation by returning a token, it notifies the callee about that cancellation so we can handle EL quiescing, so it's just as capable of being cancelled as the existing API. The only thing it doesn't have is a Promise.

This is because the Promise on the current one is weird. When we complete or cancel a Scheduled right now, we don't tell you once, we tell you twice: once via Promise and once via callback. That's a very confused idea: why are there two ways? These are much more naturally two APIs: one that fires a callback, and one that takes a promise:

protocol EventLoop {
    func scheduleTask(in amount: TimeAmount, _ task: @escaping @Sendable () -> Void, onCancel: @escaping @Sendable () -> Void) -> TaskHandle

    func scheduleTask(in amount: TimeAmount, notifying promise: EventLoopPromise<Void>) -> TaskHandle
}

struct TaskHandle {
    func cancel() { /* ... */ }
}

Of course, each of these APIs could easily be implemented in terms of the other.

But the current API is weirdly both of these at the same time. You pass a callback, but you also get a promise. The two are both notified in the same way. That's weird, and hard to justify.
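To make the "both at the same time" point concrete, here is a small standard-library-only sketch of the firing path. ToyPromise, ToyTimerHandler, RecordingHandler and the two functions are hypothetical stand-ins, not SwiftNIO types; the only thing modelled is who gets told when a timer fires, following the `promise.succeed(try task())` snippet quoted earlier.

```swift
// Toy model only: none of these types are SwiftNIO types.

/// A minimal single-threaded promise: observers are told once when it succeeds.
final class ToyPromise<Value> {
    private var value: Value?
    private var observers: [(Value) -> Void] = []

    func whenSuccess(_ observer: @escaping (Value) -> Void) {
        if let value = self.value {
            observer(value)
        } else {
            self.observers.append(observer)
        }
    }

    /// Succeeds at most once; later calls are ignored.
    func succeed(_ value: Value) {
        guard self.value == nil else { return }
        self.value = value
        let pending = self.observers
        self.observers = []
        for observer in pending { observer(value) }
    }
}

/// Closure-plus-promise shape: the caller passes a closure and can also observe a promise.
func fireClosureAndPromiseTimer() -> [String] {
    var notifications: [String] = []
    let promise = ToyPromise<Int>()
    promise.whenSuccess { _ in notifications.append("promise observer notified") }
    let task: () -> Int = {
        notifications.append("task closure ran")
        return 42
    }
    // At the deadline the loop runs the closure and completes the promise with its result.
    promise.succeed(task())
    return notifications
}

/// Handler-only shape: a single entry point, no promise.
protocol ToyTimerHandler {
    func handleScheduledCallback()
}

final class RecordingHandler: ToyTimerHandler {
    private(set) var notifications: [String] = []
    func handleScheduledCallback() { notifications.append("handler called") }
}

func fireHandlerOnlyTimer() -> [String] {
    let handler = RecordingHandler()
    handler.handleScheduledCallback()  // at the deadline the loop calls the handler, nothing else
    return handler.notifications
}

print(fireClosureAndPromiseTimer())  // two notifications for one firing
print(fireHandlerOnlyTimer())        // one notification
```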

The new API says "screw it, just do the callback". But it acknowledges that we already have an API and it can do quite a lot, so in an attempt to differentiate itself it becomes far more restrictive. Essentially, it forces the user into a pattern that will allow amortized zero-allocation timers, by ensuring that you always give us something to call rather than us just invoking a closure.

But as you know very well, there is no difference between a delegate and a pair of closures, so we could take the pair of closures instead. The downside is that this API gets very close to obviating ScheduledTask, and it gets quite hard for us to tell users when they should use one or the other. I honestly thought the better thing was to provide a more differentiated API, not a less differentiated one.
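The delegate/closure-pair equivalence can be sketched as follows. This is an illustrative, standard-library-only example: ToyCallbackHandler, IdleTimer, ClosurePair and drive are made-up names, the method names are borrowed from this PR's review thread but drop the eventLoop parameter, and the "loop" simply calls the handler straight away.

```swift
// Toy model only: none of these types are SwiftNIO types.

protocol ToyCallbackHandler {
    func handleScheduledCallback()
    func onCancelScheduledCallback()
}

/// Delegate style: a long-lived object that owns its state and can be handed
/// to the loop again each time the timer is re-armed.
final class IdleTimer: ToyCallbackHandler {
    private(set) var events: [String] = []
    func handleScheduledCallback() { events.append("fired") }
    func onCancelScheduledCallback() { events.append("cancelled") }
}

/// Closure style: the same two entry points, carried as a pair of closures.
struct ClosurePair: ToyCallbackHandler {
    let onFire: () -> Void
    let onCancel: () -> Void
    func handleScheduledCallback() { onFire() }
    func onCancelScheduledCallback() { onCancel() }
}

/// Stand-in for an event loop: fires or cancels the handler immediately.
func drive(_ handler: some ToyCallbackHandler, cancel: Bool) {
    if cancel {
        handler.onCancelScheduledCallback()
    } else {
        handler.handleScheduledCallback()
    }
}

func delegateEvents() -> [String] {
    let timer = IdleTimer()
    drive(timer, cancel: false)
    drive(timer, cancel: true)
    return timer.events
}

func closurePairEvents() -> [String] {
    var events: [String] = []
    let pair = ClosurePair(
        onFire: { events.append("fired") },
        onCancel: { events.append("cancelled") }
    )
    drive(pair, cancel: false)
    drive(pair, cancel: true)
    return events
}

print(delegateEvents())
print(closurePairEvents())
```

Both shapes observe the same events; the difference discussed in the thread is that the closure pair carries captured context with it, whereas the delegate is an object the caller already owns.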

While we're here, the 59ns argument isn't interesting because Si has already written a benchmark, which is available in his original post but I will reproduce here:

MTELG.scheduleTask(in:_:)
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) *        │       4 │       4 │       4 │       4 │       4 │       4 │       4 │     520 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ns) * │     447 │     719 │     773 │     834 │     908 │    1098 │    1348 │     520 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

MTELG.setTimer(for:_:) without NIOCustomTimerImplementation conformance
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) *        │       5 │       5 │       5 │       5 │       5 │       5 │       5 │     476 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ns) * │     482 │     760 │     823 │     905 │     966 │    1119 │    1364 │     476 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

MTELG.setTimer(for:_:) with NIOCustomTimerImplementation conformance
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) *        │       0 │       0 │       0 │       0 │       0 │       0 │       0 │    1071 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ns) * │     178 │     278 │     327 │     383 │     434 │     535 │     664 │    1071 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

P50 saves 446ns from a 773ns operation, p90 saves 474ns from a 908ns operation, and p99 saves 563ns from a 1.1us operation.
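The savings quoted above can be rechecked from the two tables with a few lines of arithmetic. The sketch below is only that: the Percentile type and its property names are illustrative, and the inputs are the "Time (total CPU) (ns)" rows for scheduleTask and for setTimer with the custom implementation.

```swift
struct Percentile {
    let name: String
    let scheduleTaskNanos: Int  // "MTELG.scheduleTask(in:_:)" row
    let customTimerNanos: Int   // "with NIOCustomTimerImplementation conformance" row

    /// Absolute saving in nanoseconds.
    var savedNanos: Int { scheduleTaskNanos - customTimerNanos }

    /// How many times faster the custom implementation is, rounded to two decimals.
    var speedup: Double {
        let ratio = Double(scheduleTaskNanos) / Double(customTimerNanos)
        return (ratio * 100).rounded() / 100
    }
}

let percentiles = [
    Percentile(name: "p50", scheduleTaskNanos: 773, customTimerNanos: 327),
    Percentile(name: "p90", scheduleTaskNanos: 908, customTimerNanos: 434),
    Percentile(name: "p99", scheduleTaskNanos: 1098, customTimerNanos: 535),
]

for p in percentiles {
    print("\(p.name): saves \(p.savedNanos) ns, \(p.speedup)x faster")
}
```

At each of these percentiles the custom implementation takes less than half the CPU time of scheduleTask.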

All of these are things that you can get away with doing millions of times in a second, but all of these are things that are likely to be done many thousands of times per second. As an example, QUIC connections will typically set at least two per-packet timers per connection (idle timer and probe timer). Using ScheduledTask in this context gives little room to move, and the EL offers no other timer solution. If we need a cheaper one, we can work out how to do that, but I don't think we have many ways to get cheaper than the implementation offered here.

weissi · 2024-07-29T12:25:52Z

Thanks @Lukasa for writing this. I think this is the missing motivation. I don't think the real motivation is the savings:

P50 saves 446ns from a 773ns operation, p90 saves 474ns from a 908ns operation, and p99 saves 563ns from a 1.1us operation.

That's nice and exactly in line with the prediction of 59ns just for allocs, which we upped by 10x (to account for worse cases and other overhead, like initialising the classes in those allocs) to 500ns. Of course it's fantastic to save this, but without a use case where that makes even a 1% difference I didn't (and don't) think it's worth inventing new API today, before that use case actually exists.

But your post does give the real reason (which is much much better than the performance): The current API is weird and not as performant as it could be. I still think a real world use case motivating doing the change now would be nice (as opposed to when the use case is there which would provide more information). But I think your reasoning is good. My feedback to @simonjbeaumont is then: Copy Cory's post as the motivation (weird API) with the nice side effect: Your programs may also run a tiny bit faster if you schedule loads of timers.

But as you know very well, there is no difference between a delegate and a pair of closures, so we could take the pair of closures instead. The downside is that this API gets very close to obviating ScheduledTask, and it gets quite hard for us to tell users when they should use one or the other. I honestly thought the better thing was to provide a more differentiated API, not a less differentiated one.

I would've thought making it more aligned is better as it makes it easier for people to migrate from weird API to better API but I can see different arguments here.

Last question: Do we think we need to keep the old API (apart from SemVer reasons) or can we make the new API a complete replacement?

simonjbeaumont · 2024-07-30T15:19:31Z

OK, @weissi; I updated the PR description to emphasise the API clarity and demote the performance win as a nice side-effect for the use cases for which it would matter.

weissi · 2024-07-30T15:51:41Z

> OK, @weissi; I updated the PR description to emphasise the API clarity and demote the performance win as a nice side-effect for the use cases for which it would matter.

Awesome, that makes much more sense to me (maybe adjust the title too).

simonjbeaumont · 2024-08-14T08:16:21Z

Looks like @weissi is happy now with the motivation (not sure if that included a code review), and @glbrntt has approved.

@Lukasa, @FranzBusch: this looks like it's waiting for a re-review from one/both of you.

FranzBusch · 2024-08-15T09:20:38Z

Benchmarks/Benchmarks/NIOPosixBenchmarks/Benchmarks.swift

+    Benchmark(
+        "MTELG.scheduleTask(in:_:)",
+        configuration: Benchmark.Configuration(
+            metrics: [.mallocCountTotal, .instructions],


Can we use our defaultMetrics here? We don't have instruction based benchmarks yet

ISTR using .instructions because you asked me to, in place of wall clock time. I can use defaultMetrics though, sure.

FranzBusch · 2024-08-15T09:37:14Z

Benchmarks/Benchmarks/NIOPosixBenchmarks/Benchmarks.swift

+        "MTELG.scheduleTask(in:_:)",
+        configuration: Benchmark.Configuration(
+            metrics: [.mallocCountTotal, .instructions],
+            scalingFactor: .kilo


Can we align the maximum duration/iterations with #2839?

Can I gently push back on that ask?

You're asking me to align with a new precedent in an un-merged PR that was opened well after this one, instead of this PR keeping the conventions of the branch it is targeting.

This PR has been subject to a lot of scope creep already.

I see it's been merged already. So I've now merged this PR with main and updated the benchmarks to use the same configuration as the rest.

FranzBusch · 2024-08-15T09:39:03Z

Sources/NIOCore/NIOScheduledCallback.swift

+    /// implicitly, if it was still pending when the event loop was shut down.
+    ///
+    /// - Parameter eventLoop: The event loop on which the callback was scheduled.
+    func onCancelScheduledCallback(eventLoop: some EventLoop)


Ultra naming nit: Should this be didCancelScheduledCallback? This follows how Swift APIs are often named when something did happen. on is used more when something is about to happen or is in the process of happening, e.g. viewDidLoad in UIKit vs onAppear in SwiftUI.

I think that on is also used for when something has happened but is more ambiguous about whether it's after. IIRC it's will that implies before. E.g. willSet and didSet.

A quick git-grep in the NIO repo shows a use of both on and did quite a bit.

However, it's fair that the only public API uses did:

Sources/NIOCore/AsyncSequences/NIOAsyncSequenceProducerStrategies.swift:37: public mutating func didYield(bufferDepth: Int) -> Bool {
Sources/NIOCore/AsyncSequences/NIOAsyncSequenceProducerStrategies.swift:42: public mutating func didConsume(bufferDepth: Int) -> Bool {
Sources/NIOCore/AsyncSequences/NIOAsyncWriter.swift:70: public func didYield(_ element: Element) {
Sources/NIOPosix/PendingDatagramWritesManager.swift:249: public mutating func didWrite(
Sources/NIOPosix/PendingWritesManager.swift:201: public mutating func didWrite(

I can change to using did 😅

FranzBusch · 2024-08-15T09:39:59Z

Sources/NIOCore/NIOScheduledCallback.swift

+    ///
+    /// - NOTE: This property is for event loop implementors only.
+    @inlinable
+    public var customCallbackID: UInt64? {


NIT: Do we need the custom here in the naming?

IMO it adds value when glancing at it, as the property name implies that it's only relevant for custom implementations. How strongly do you feel about it? It's public API so if there's a consensus that this needs a different name I'll suck it up 😄

FranzBusch · 2024-08-15T09:48:58Z

Sources/NIOPosix/SelectableEventLoop.swift

+        at deadline: NIODeadline,
+        handler: some NIOScheduledCallbackHandler
+    ) throws -> NIOScheduledCallback {
+        let taskID = self.scheduledTaskCounter.loadThenWrappingIncrement(ordering: .relaxed)


Something for the future potentially. With this new fast implementation to schedule callbacks it might be worth looking at the normal scheduleTask APIs again to see if we can revamp their internal implementation to avoid some of the allocations that they currently have. We can't get rid of all of them i.e. the returned ELP/ELF but potentially the internal allocs where we close over state.

Sure, I think @Lukasa's tome comment here will provide good content for someone wanting to see if we can improve it: #2759 (comment).

But as to whether we should, it sounded like the back and forth on the motivation was that the older APIs are a little confusing to hold. Without wanting to open another can of worms, I guess a bigger question is: are there real use cases for scheduleTask that cannot be implemented with scheduleCallback? If the answer is no, then should we consider deprecating the old one, given the discussion about its confusing shape?

simonjbeaumont

@FranzBusch addressed your feedback, thanks!

simonjbeaumont · 2024-08-15T13:41:09Z

Benchmarks/Benchmarks/NIOPosixBenchmarks/Benchmarks.swift

+        "MTELG.scheduleTask(in:_:)",
+        configuration: Benchmark.Configuration(
+            metrics: [.mallocCountTotal, .instructions],
+            scalingFactor: .kilo


I see it's been merged already. So I've now merged this PR with main and updated the benchmarks to use the same configuration as the rest.

simonjbeaumont · 2024-08-15T13:41:21Z

Benchmarks/Benchmarks/NIOPosixBenchmarks/Benchmarks.swift

+    Benchmark(
+        "MTELG.scheduleTask(in:_:)",
+        configuration: Benchmark.Configuration(
+            metrics: [.mallocCountTotal, .instructions],


simonjbeaumont · 2024-08-15T13:42:23Z

Benchmarks/Benchmarks/NIOPosixBenchmarks/Benchmarks.swift

+        let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
+        defer { try! group.syncShutdownGracefully() }


While I still disagree with this feedback and think we should change the other benchmarks too for local reasoning reasons, I'm more interested in converging this PR, so I've updated to accommodate this feedback.

simonjbeaumont · 2024-08-15T13:52:11Z

Sources/NIOCore/NIOScheduledCallback.swift

+    /// implicitly, if it was still pending when the event loop was shut down.
+    ///
+    /// - Parameter eventLoop: The event loop on which the callback was scheduled.
+    func onCancelScheduledCallback(eventLoop: some EventLoop)


simonjbeaumont · 2024-08-29T10:39:45Z

Gentle ping. 🥺

FranzBusch

This LGTM now. I left one perf suggestion which I think should be noticeable on a micro level. If @Lukasa is also happy with it, I am happy to merge this.

FranzBusch · 2024-09-02T11:39:43Z

Sources/NIOPosix/MultiThreadedEventLoopGroup.swift

+    @usableFromInline
+    enum Kind {
+        case task(task: () -> Void, failFn: (Error) -> Void)
+        case callback(any NIOScheduledCallbackHandler)


One performance thought here. Currently we are storing an existential callback handler. However, we could just store the two closures themselves, which we can get while we are in the generic method. This way we would avoid calling through an existential on every scheduled callback task.

The requirements for protocol NIOScheduledCallbackHandler are generic functions. Is there a way I can store these as closures in the enum associated value, without making the Kind generic?

I forgot to add the second half of this. Yes this is why I think we should go back to any EventLoop. I assume the perf benefit outweighs this. This is an assumption but the scheduling and running of tasks is probably hotter than whatever we do in their callback with the passed EL. @Lukasa WDYT?

The functions being generic don't really matter here: we can store generic closures in the Kind type without promoting the generic to the Kind type itself. We won't get specialization, but that's fine.

So TL;DR: yes, changing ScheduledEventLoop's representation to a pair of closures is probably the right thing to do.
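As a rough illustration of the idea in this thread, the sketch below builds the two closures inside the generic method, where the handler's concrete type is known, and stores them in a non-generic enum. `Loop`, `CallbackHandler`, `makeCallbackKind`, `TestLoop` and `Recorder` are hypothetical stand-ins, not the real NIO types, and this is not the implementation that landed in the PR. It assumes Swift 5.7 or later, so that an `any Loop` value can be passed to a `some Loop` parameter.

```swift
// Hypothetical stand-ins for EventLoop and NIOScheduledCallbackHandler.
protocol Loop {
    var label: String { get }
}

protocol CallbackHandler {
    func handleScheduledCallback(eventLoop: some Loop)
    func didCancelScheduledCallback(eventLoop: some Loop)
}

// Non-generic storage: the callback case holds two closures instead of an existential handler.
enum Kind {
    case task(task: () -> Void, failFn: (Error) -> Void)
    case callback(fire: (any Loop) -> Void, cancel: (any Loop) -> Void)
}

// The generic context: the closures capture the concrete handler here,
// so Kind itself does not need a generic parameter.
func makeCallbackKind(handler: some CallbackHandler) -> Kind {
    .callback(
        fire: { loop in handler.handleScheduledCallback(eventLoop: loop) },
        cancel: { loop in handler.didCancelScheduledCallback(eventLoop: loop) }
    )
}

// Usage: a test loop and a handler that records what happened to it.
struct TestLoop: Loop {
    let label: String
}

final class Recorder: CallbackHandler {
    var events: [String] = []
    func handleScheduledCallback(eventLoop: some Loop) {
        events.append("fired on \(eventLoop.label)")
    }
    func didCancelScheduledCallback(eventLoop: some Loop) {
        events.append("cancelled on \(eventLoop.label)")
    }
}

let recorder = Recorder()
let kind = makeCallbackKind(handler: recorder)
if case .callback(let fire, _) = kind {
    fire(TestLoop(label: "loop-1"))
}
print(recorder.events)
```

The closures here take `any Loop`, in line with the suggestion to go back to `any EventLoop`. Whether this is actually faster than storing `any NIOScheduledCallbackHandler` would need to be measured with the benchmarks in this PR.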

simonjbeaumont added 5 commits June 28, 2024 15:21

benchmarks: Add benchmark for MTELG.scheduleTask(in:_:)

5f7065f

api: Add NIOTimer, NIOTimerHandler, and EventLoop.setTimer(for:_:)

33e82e0

benchmarks: Add benchmark for MTELG.setTimer(for:_:)

b59e31d

internal: Add NIOCustomTimerImplementation conformance to SelectableEventLoop

17cf105

test: Add Linux pre-5.9.2 backport for fulfillment(of:timeout:enforceOrder:)

0afb3a2

test: Increase timer used in shutdown test

df3cdb1

glbrntt reviewed Jul 3, 2024

Sources/NIOCore/EventLoop.swift Outdated

Sources/NIOCore/NIOTimer.swift Outdated

Sources/NIOCore/EventLoop.swift Outdated

simonjbeaumont commented Jul 4, 2024

FranzBusch requested changes Jul 4, 2024

Lukasa reviewed Jul 4, 2024

simonjbeaumont added 12 commits July 8, 2024 09:13

feedback: Make NIOTimer Sendable

b0d3f66

feedback: Rename timerFired(loop:) to timerFired(eventLoop:)

907c087

feedback(attempted): Store a closure instead of UInt64

b3e4903

feedback(reverted, allocates): Store a closure instead of UInt64

fb5e835

This reverts commit 0d9a6f9d6bb4add42e02127daddd01e00d0e6b6d.

feedback(unsure): Replace extra protocol with runtime checks

89c57ec

feedback(unsure): Generic timerFired protocol witness

59a04de

feedback(unsure): Make setTimer generic over the handler

06a4ce7

feedback: Use labelled parameter for handler

950db0c

feedback: Use separate prepositions for TimeAmount and NIODeadline APIs

d6ae472

Remove DocC disambiguation for now until API is decided

4439ac6

feedback: Add documentation to NIOTimerHandler.timerFired protocol requirement

abd4b28

feedback: Change API terms from setTimer to scheduleCallback

ceabc7b

simonjbeaumont commented Jul 8, 2024

glbrntt requested changes Jul 9, 2024

simonjbeaumont added 3 commits July 10, 2024 08:26

feedback: Local variable rename: taskId -> taskID

80d91f6

feedback: Reanme NIOScheduledCallbackHandler.onSchedule to handleScheduledCallback

2ec5543

feedback: Update protocol requirement documentation comments to use DocC refs

5d6ac17

simonjbeaumont commented Jul 10, 2024

glbrntt reviewed Jul 11, 2024

format: Update for new format and lint rules

81cc415

simonjbeaumont requested a review from Lukasa July 25, 2024 15:21

simonjbeaumont changed the title ~~Add EventLoop APIs for cheaper setting of timers~~ Add EventLoop APIs for cheap scheduling of callbacks Jul 25, 2024

simonjbeaumont added 2 commits July 30, 2024 16:08

Merge remote-tracking branch 'upstream/main' into sb/timer-api

37ebfde

Remove use of Task.sleep(for:) in tests

4467728

simonjbeaumont force-pushed the sb/timer-api branch from f34f9ef to 4467728 Compare July 30, 2024 15:20

simonjbeaumont changed the title ~~Add EventLoop APIs for cheap scheduling of callbacks~~ Add EventLoop APIs for simpler scheduling of callbacks Jul 30, 2024

FranzBusch reviewed Aug 15, 2024

simonjbeaumont added 3 commits August 15, 2024 14:33

Merge remote-tracking branch 'upstream/main' into sb/timer-api

6544461

Update benchmark to use same config as other benchmarks

c48b7aa

Rename onCancelScheduledCallback to didCancelScheduledCallback

151c2c8

simonjbeaumont commented Aug 15, 2024

simonjbeaumont requested a review from FranzBusch August 15, 2024 15:18

simonjbeaumont added the semver/minor Adds new public API. label Aug 15, 2024

FranzBusch approved these changes Sep 2, 2024
