Add finalizer support in async API #1173

zuiderkwast · 2023-03-12T15:52:42Z

This is another take on adding finalizer to the async API. Some things are taken from #926 but the implementation is completely rewritten and includes a rewrite of pubsub handling and adds reference counting for callbacks in the async API.

API changes

The redisAsyncCommand() family of functions have got new variants with "WithFinalizer" suffixes. The new functions are redisvAsyncCommandWithFinalizer(), redisAsyncCommandWithFinalizer(), redisAsyncCommandArgvWithFinalizer() and redisAsyncFormattedCommandWithFinalizer().
For each call to redisAsyncCommandWithFinalizer or one of its variants, the finalizer will be called exactly once.
The finalizer is called when a callback (even if the callback function is NULL) is no longer associated with any command or pubsub channel or pattern.
The finalizer is called after the last call to the callback. The callback is never called after the finalizer has been called.
Support the RESET command: It cancels subscriptions and MONITOR mode.
Accept commands while in MONITOR mode.
Add support for sharded pubsub (SSUBSCRIBE, SUNSUBSCRIBE).

Internal changes

The callback queue is changed so that each callback is allocated once and freed once. The operations "push" and "shift" don't copy the callback struct between heap and stack anymore, but instead they and just set the 'next' pointer to link callbacks into a list, without doing any allocations. For pubsub channels, one callback can be mapped to multiple channels, and they can be unsubscribed independently, so we use a reference counter to keep track of references to a callback. This makes adding finalizer support, and making sure it is called only once when the callback is freed, strait-forward.

The way hiredis kept tack of 'pubsub mode' using pending_subs and unsubscribe_sent was flawed. It couldn't handle multiple pending subscribe commands with an overlapping but different set of channels and it couldn't handle an error reply to a SUBSCRIBE or UNSUBSCRIBE command. Now we change this so that all commands, even SUBSCRIBE and UNSUBSCRIBE, are added to the reply queue and the 'subscribe' and 'unsubscribe' replies are matched against them in the queue. The channel-to-callback dicts are updated when a 'subscribe' or 'unsubscribe' reply is received, confirming that the client has subscribed or unsubscribed to a channel, rather than when the command is sent. This change makes it possible to handle an error reply for a pubsub command.

With this change, there is no need to keep a different replies queue for subscribe commands and regular commands. All are added to the same replies queue.

The callback queue in the changed so that each callback is allocated once and freed once. The operations "push" and "shift" don't copy the callback struct between heap and stack, but instead they and just set the 'next' pointer to link callbacks into a list, without doing any allocations. For pubsub channels, one callback can be mapped to multiple channels, and they can be unsubscribed independently, so we use a reference counter to keep track of references to a callback. This prepares for adding finalizer support and to be sure it is called only once, when the callback is freed. The way hiredis kept tack of 'pubsub mode' using `pending_subs` and `unsubscribe_sent` was flawed. It couldn't handle multiple pending subscribe commands with an overlapping but different set of channels and it couldn't handle an error reply to a SUBSCRIBE or UNSUBSCRIBE command. Now we change this so that all commands, even SUBSCRIBE and UNSUBSCRIBE, are added to the reply queue and the 'subscribe' and 'unsubscribe' replies are matched against them in the queue. The channel-to-callback dicts are updated when a 'subscribe' or 'unsubscribe' reply is received, confirming that the client has subscribed or unsubscribed to a channel, rather than when the command is sent. This change makes it possible to handle an error reply for a pubsub command. With this change, there is no need to keep a different replies queue for subscribe commands and regular commands. All are added to the same replies queue.

* For each time redisAsyncCommandWithFinalizer or one of its variants has been called with a finalizer, the finalizer will be called exactly once. * The finalizer is called when a callback (even if the callback function is NULL) is no longer associated with any command or pubsub channel or pattern. * The finalizer is called after the last call to the callback. The callback is never called after the finalizer has been called. New async API functions added: * redisvAsyncCommandWithFinalizer * redisAsyncCommandWithFinalizer * redisAsyncCommandArgvWithFinalizer * redisAsyncFormattedCommandWithFinalizer Co-Authored-By: Tudor Bosman <[email protected]>

Co-Authored-By: Tudor Bosman <[email protected]>

Additionally, accept commands in monitor mode. (For example the RESET command, but also other commands.) Apart from being useful but itself, this change makes the async API's reply queue stay in sync (mapping each reply to the callback given when the command was sent) when bombed with random commands (fuzzing).

michael-grunder · 2023-07-21T04:05:35Z

This is really good work. If it's ready I'll go through it and get it merged.

zuiderkwast · 2023-07-21T09:15:43Z

This is really good work. If it's ready I'll go through it and get it merged.

Thanks. I thought it was ready, but there is some minor issue (timeout in a test) after I added the last commit (RESET command). I hope to get it fixed today or in the next days.

zuiderkwast · 2023-07-21T15:06:13Z

@michael-grunder Did you restart the mac build? :-)

I believe this is ready now.

zuiderkwast · 2023-07-21T15:08:20Z

If you want to squash-merge, we can use the top comment as the commit message.

If not, I can squash the fixup commits into the main commits later.

kristjanvalur · 2023-07-29T15:58:42Z

Its been a long time since I looked at this code, but are the callbacks hashed (and refcounted) according to both the callback function pointer and the privdata?

zuiderkwast · 2023-07-29T17:51:12Z

are the callbacks hashed (and refcounted) according to both the callback function pointer and the privdata?

No, they are refcounted only per call to redisAsyncCommandWithFinalizer or its variants. The code doesn't check if the same privdata or function is used for other async commands.

[Edit] As mentioned in the PR description: "For each call to redisAsyncCommandWithFinalizer or one of its variants, the finalizer will be called exactly once".

This means hiredis doesn't care at all what privdata is. If the user wants to use the same privdata for multiple calls, they can put a refcounter on the privdata and decrement it in the finalizer. @kristjanvalur Do you think it is the right design or do you have another idea?

zuiderkwast · 2023-08-21T09:25:49Z

@michael-grunder I just came back from vacation. Did you have any chance to look at this yet?

The use case we have is to implement pubsub in hiredis-cluster, which is a cluster wrapper around hiredis. For this, we would use the same finalizer for all commands, so a simpler API would suffice, e.g. redisAsyncSetFinalizer(ac, finalizer) rather than adding the four variants of redisAsyncCommandWithFinalizer. The finalizer can then be stored in the async context rather than in the callback struct. WDYT?

kristjanvalur · 2023-08-21T11:21:17Z

This means hiredis doesn't care at all what privdata is. If the user wants to use the same privdata for multiple calls, they can put a refcounter on the privdata and decrement it in the finalizer. @kristjanvalur Do you think it is the right design or do you have another idea?

I haven't thought too deeply about it but I suspect the design is fine, if every call gets one call back. I remember having concerns about the previous attempt where callbacks were registered sepearately, if I remember correctly. #926

Edit: Thinking more about it, my concerns were with an API where one explicitly registers a callback, and then, whether the same CB pointer, with a different "privdata" would be considered a separate listener or not. I mean, you could use the same actual callback with different privdata to listen to different things.

This is never ambiguous with the proposed WithFinalizer API, but if you go with redisAsyncSetFinalizer then you need to specify probably that only a single finalizer can be active, or something similar.

zuiderkwast · 2023-08-22T09:40:00Z

if you go with redisAsyncSetFinalizer then you need to specify probably that only a single finalizer can be active, or something similar.

Yes, it's like a config, so only a single finalizer can be set. Setting a new one replaces the previous one. Is that what you mean?

The finalizer would be called once for each command, whenever the command's callback struct is freed internally. (For UNSUBSCRIBE, this means the finalizer is called immediately (because the replies to UNSUBSCRIBE are passed to SUBSCRIBE's callback) and even if privdata and callback are NULL, but this is fine to me too.)

I don't have a strong preference regarding *WithFinalizer vs redisAsyncSetFinalizer.

kristjanvalur · 2023-08-22T15:55:04Z

So, here is the thing:
The finalizer is intended to finalize a callback, which has been registered. A callback is a unique tuple (redisCallbackFn* cb, void *privdata). The finalizer is supposed to be called when the (*cb)(privdata) call will never be made again.

So, if you go with the withFinalizer thing, you will have to change it so that it receives both the function pointer and the privdata, i.e. te redisFinalizerCallback must be somethign like typedef void (redisfinalizerCallback*)(redisCallbackFn *cb, void *privdata)

zuiderkwast · 2023-08-22T16:21:43Z

OK, I see what you mean. But in the implementation, there is nothing that requires the tuple (callback_fn, privdata) to be unique. Even the exact same tuple can be used multiple times. I don't think the implementation should check that. It would be unnecessary complex and maybe even undesirable. Probably we should change the description to something like

"The finalizer is called when a callback (even if the callback function is NULL) is no longer associated with the command where it was provided"

instead of "... with any command or pubsub channel or pattern".

Btw, I realized now I haven't documented this feature in the README. I'll do that when there is a decision about the API.

kristjanvalur · 2023-08-23T09:00:47Z

"The finalizer is called when a callback (even if the callback function is NULL) is no longer associated with the command where it was provided"

Yes, that is clearer. I just want to make sure that the finalizer then, needs to get both the callback pointer and the privdata provided to the API method which is being finalized.

You may even want to provide the finalizer with its own privdata. That way, both the callback and finalizer can effectively be "bound methods."

typedef void (redisFinalizerCallback)(struct redisAsyncContext *ac, void *finalizer_privdata, redisCallbackFn * cb, void *cb_privdata)

You could add some blurb in the line of "for most commands, that will be right after the callback is called for the only time, but for subscriptions that happens only when the subscription completes."

zuiderkwast · 2023-08-23T09:21:20Z

I just want to make sure that the finalizer then, needs to get both the callback pointer and the privdata provided to the API method which is being finalized.

Why would you need that? Do you have a use case?

The use case so far is to free the privdata when it is no longer used. This assumes you do not reuse privdata for multiple commands (or if you do, you can use a reference counter) but I can't see why you would need the callback function pointer in the finalizer.

You may even want to provide the finalizer with its own privdata. That way, both the callback and finalizer can effectively be "bound methods."

I'd rather avoid that. I can't see why this would be useful. If you want special data for the finalizer, it can be embedded in the same privdata argument passed to the command.

You could add some blurb in the line of "for most commands, that will be right after the callback is called for the only time, but for subscriptions that happens only when the subscription completes."

Good idea. I'll add something like that.

kristjanvalur · 2023-08-23T11:35:16Z

I just want to make sure that the finalizer then, needs to get both the callback pointer and the privdata provided to the API method which is being finalized.

Why would you need that? Do you have a use case?

Sure, you call two API functions. Both take different callback pointers and different privdata. Different callbacs can have different privdata formats.

Once you get a "finalizer" call, you have no idea which one of those calls are being finalized, unless both the callback and privdata are returned. The privdata is meaningless without a corresponding function pointer to associate it with.

I think that the point I'm trying to get across here, is that we should be able to consider a (func_ptr, data_ptr) as a bound method, a pointer to a class method, and they should always be kept around and used as a pair. The data_ptr is meaningless without the corresponding func_ptr and vice versa.

Whenever, in C, a callback is provided, the callback should always be a function pointer and some anynomus arg. That way, you can easily take an C++ class instance method and wrap it into a static method + this pointer.
Each callback function registered in C should (in general for C api, not just here) always have its own private data like that. That way, the callbacks can belong to different objects and you can register object methods as callbacks everywhere. Otherwise you are forced to view all your callbacks as different static methods bound to a single "object" which puts limits on how an api is used.

That's how I would design an API to be the most general. However, you could argue that people should just declare a single "privdata" data structure and use that everywhere where they use the api. Make it a limitation of the api. But then that should probably be explicitly mentioned. I.e. if you see the finalizer solely as a "free" function for privdata, and the finalizer is not per-call, then all the "privdata" need to be of the same type. Which was previously not a requirement in the API.

For this reason, I think I prefer the "with finalizer" approach, because in the current form, it allows you to register two bound methods to the same_object for each api, the callback of the object and the finalizer of the object, without restricting you to use the same "object type" everywhere.

zuiderkwast force-pushed the finalizer branch 2 times, most recently from 1b50cd8 to 342cb92 Compare July 19, 2023 09:39

zuiderkwast and others added 2 commits July 19, 2023 12:25

Add example-libevent-pubsub

5468136

Co-Authored-By: Tudor Bosman <[email protected]>

zuiderkwast force-pushed the finalizer branch from e90ea9a to 5468136 Compare July 19, 2023 13:21

zuiderkwast changed the title ~~Async finalizer rebased (WIP)~~ Add finalizer support in async API Jul 19, 2023

zuiderkwast marked this pull request as ready for review July 19, 2023 23:17

zuiderkwast mentioned this pull request Jul 19, 2023

Add finalizer support to async Redis client #926

Open

zuiderkwast force-pushed the finalizer branch from 9169ba5 to 41cf219 Compare July 20, 2023 17:17

zuiderkwast added 4 commits July 21, 2023 14:45

Fix pointer errors in test_reset

17c391b

Test case fix: Break event loop on server close on QUIT

cbe49dd

Test fix: Disconnect properly in test_reset()

7081c21

Remove unused fields in async context

7f8a1bb

Add docs of the WithFinalizer functions in README

63968e9

zuiderkwast added 3 commits August 23, 2023 16:27

Merge remote-tracking branch 'redis/master' into finalizer

cb6fbbf

Fixup: Spelling

401053a

Fixup: Spelling again

27d36f4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add finalizer support in async API #1173

Add finalizer support in async API #1173

zuiderkwast commented Mar 12, 2023 •

edited

Loading

michael-grunder commented Jul 21, 2023

zuiderkwast commented Jul 21, 2023

zuiderkwast commented Jul 21, 2023

zuiderkwast commented Jul 21, 2023

kristjanvalur commented Jul 29, 2023

zuiderkwast commented Jul 29, 2023 •

edited

Loading

zuiderkwast commented Aug 21, 2023

kristjanvalur commented Aug 21, 2023 •

edited

Loading

zuiderkwast commented Aug 22, 2023

kristjanvalur commented Aug 22, 2023 •

edited

Loading

zuiderkwast commented Aug 22, 2023

kristjanvalur commented Aug 23, 2023

zuiderkwast commented Aug 23, 2023

kristjanvalur commented Aug 23, 2023 •

edited

Loading

Add finalizer support in async API #1173

Are you sure you want to change the base?

Add finalizer support in async API #1173

Conversation

zuiderkwast commented Mar 12, 2023 • edited Loading

API changes

Internal changes

michael-grunder commented Jul 21, 2023

zuiderkwast commented Jul 21, 2023

zuiderkwast commented Jul 21, 2023

zuiderkwast commented Jul 21, 2023

kristjanvalur commented Jul 29, 2023

zuiderkwast commented Jul 29, 2023 • edited Loading

zuiderkwast commented Aug 21, 2023

kristjanvalur commented Aug 21, 2023 • edited Loading

zuiderkwast commented Aug 22, 2023

kristjanvalur commented Aug 22, 2023 • edited Loading

zuiderkwast commented Aug 22, 2023

kristjanvalur commented Aug 23, 2023

zuiderkwast commented Aug 23, 2023

kristjanvalur commented Aug 23, 2023 • edited Loading

zuiderkwast commented Mar 12, 2023 •

edited

Loading

zuiderkwast commented Jul 29, 2023 •

edited

Loading

kristjanvalur commented Aug 21, 2023 •

edited

Loading

kristjanvalur commented Aug 22, 2023 •

edited

Loading

kristjanvalur commented Aug 23, 2023 •

edited

Loading