Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add queries synchronization feature #4

Closed
wants to merge 2 commits into from

Conversation

pklsk
Copy link

@pklsk pklsk commented Jun 29, 2023

  • added automatic sync between queries;
  • added function for manual cache update

Queries now sync data unless the "normalize" option is set to false in the config or query meta.

Since data propagation occurs by checking the "manual" property in the event action properties, manually setting the cache via the QueryClient will not propagate the data change to other queries. In order to manually setting the cache and propagate the changes, the function of setting the cache has been added.

- added synchronize option to config;
- added synchronize option to query meta
- added function for manual cache update

Queries now sync data unless the synchronize option is set to false in the config or query meta.
Since data propagation occurs by checking the "manual" property in the event action properties, manually setting the cache via the QueryClient will not propagate the data change to other queries. In order to manually setting the cache and propagate the changes, the function of setting the cache has been added.
@pklsk pklsk closed this Jun 29, 2023
@pklsk pklsk deleted the feature_queries_synchronization branch June 29, 2023 20:12
@pklsk pklsk restored the feature_queries_synchronization branch June 29, 2023 20:13
@pklsk pklsk reopened this Jun 29, 2023
@klis87
Copy link
Owner

klis87 commented Jun 29, 2023

Thanks for the PR. Before reviewing, I would actually ask first, what do you mean by added automatic sync between queries? Perhaps that a query response can update your other existing queries because new query could have more fresh data?

@klis87
Copy link
Owner

klis87 commented Jun 29, 2023

Could you please provide an example (from app perspective, with a use case which would require this new feature?). I do not mean a real full example, just some code snippet would be very helpful.

By the way, if this new manual method is needed now, I think related PR would be beneficial - #2

Actually above should contain also some provider, so that then we could have a hook, which could read normalized store from React context. So you could get the normalized store wherever in your app, and it would be then easy to use this new method anywhere.

@pklsk
Copy link
Author

pklsk commented Jun 29, 2023

Hello, thanks for your attention.

Perhaps that a query response can update your other existing queries because new query could have more fresh data?

Yes, that's what it does
Here is a small sample project. (it is just generated with vite, please see app.ts file)
It has two queries and two views, as well as a server simulator. To see how it works, click on the "simulateServerProcess" button and then on the "update" button on any of the views. Both views will receive updated data.
I think this is reasonable, because without this feature, both views will still be updated after the mutation, so we achieve predictable behavior because if the "normalize" parameter is enabled, the queries will always be synchronized, regardless of whether new data has arrived from another query or from the response to the mutation.

By the way, if this new manual method is needed now, I think related PR would be beneficial

In fact, it is needed simply because query response synchronization checks to see if the cache has been set manually to avoid a stack overflow. (look at line 78 in the modified file) yes, it's a hack :)
In this case, "queryClient.setQueryData" will work as it would without your library. So I decided to add a wrapper over this method that would distribute the data to all queries.
In fact, I'm not sure if anyone really needs it. Perhaps the "setQueryData" method should really be moved to another PR, or simply removed.

Actually above should contain also some provider, so that then we could have a hook, which could read normalized store from React context. So you could get the normalized store wherever in your app, and it would be then easy to use this new method anywhere.

Yes, I think that it would be useful, in any case, not only on this PR. In my projects, I use the provider, and the context for normalized cache, but it seems to me that this is outside the scope of this PR.

normy_example.zip

@klis87
Copy link
Owner

klis87 commented Jun 30, 2023

Hmm, this is actually the problem I already considered and decided no to implement for a reason. Queries should be read-only and typically they should not change data. I understand that you mean a case in which a query could have old data for some reason, and another query would update it, almost like automatic invalidation of dependent queries, but imho the key is, that this would be done by accident. This is the contract to mutations, which we know that they can update data.

In my opinion, query synchronization is like giving us a false security that all data is up to date. Instead, maybe we should think, why a data can be not fresh, maybe we should have subscriptions, some short polling instead?

So, I think that main disadvantage of this is:

  1. it would hurt performance, each query can have big data, often much bigger than mutation, it could fetch many objects, and potentially update many other queries, even worse with react-query invalidation on focus, you could have like 10 queries updated, imagine they depend on each other, it would update each query 10 times even though all would be invalidated and fresh
  2. the false security I mentioned above, unpredictable query data updates.

Taking those cons into discussion, could you think about some good examples where this would be in fact beneficial to have?

@pklsk
Copy link
Author

pklsk commented Jun 30, 2023

Before I reply to your previous comment, could you tell me what you think about the "Apollo client"? Just to better understand your opinion on some things.

In any case, some of what you wrote made me think, maybe it will take some time to think.

@klis87
Copy link
Owner

klis87 commented Jun 30, 2023

Apollo client was actually the library which made me done https://github.com/klis87/redux-requests . I did not have much time recently though to work on it, so I took the most important thing for me - normalization out of it - to be able to use this feature with libs like react-query - this is genesis of normy :)

So I like Apollo but I do not understand it was the 1st lib like it created only for graphql, as those concepts, as other libs showed, can be applied to REST and other APIs.

Regarding my opinion, I remember I was crazy about its normalization feature, but I sometimes had some issues with it, like data did not update, or had some weird not debuggable bugs, it was also hard to update data manually. Now I prefer libs like react-query, because they achieve simililar things in cleaner way, plus you can use it no matter you use graphql or not.

@pklsk
Copy link
Author

pklsk commented Jun 30, 2023

So actually I have a lot of thoughts on this subject. Even outside the scope of the PR, I would be interested in your opinion on what I will write next.

There is a lot of text, so if this discussion is boring for you, please say so.

Queries should be read-only and typically they should not change data.

This is exactly the point that has been causing me cognitive dissonance for a long time. Conceptually, this is true, and this is exactly why I think it is not worth implementing the functionality from this question Implement automatic data updates for adding a node to arrays and removing from arrays #1.
I think that what should be in the list (and they can be built according to a variety of criteria) should be in the area of responsibility of a particular query. For example, whether an object created with mutation gets into the current list is determined directly on the server in the depth of the business logic. So in that case, what the mutation is supposed to do is just invalidate queries that we think might be outdated by the mutation, because we'll never really know for sure, or we'll have to duplicate all the business logic.

I understand that you mean a case in which a query could have old data for some reason, and another query would update it, almost like automatic invalidation of dependent queries, but imho the key is, that this would be done by accident. This is the contract to mutations, which we know that they can update data.

In fact, conceptually, when we mutate an object, the same thing happens as in the example with the creation of a new object. That is, conceptually the cleanest way would be to always just invalidate queries and never update the cache in response to mutations.
However, this gives rise to a number of problems.
First, as a rule, in response to a mutation, the server sends us actual data for a specific object. And while we can't know which lists the new object should be in, or which lists the deleted object should be removed from, at least we can know that the particular object we're tracking somehow has changed some of its data. And they are already at our disposal.
Therefore, it seems to me, conceptually, it would be worthwhile to separate the validity of the query and the validity of the cache. What we do when we manually update the cache in response to mutations or optimistic mutation updates is actually updating the cache, not the query that references that cache. Ideally, as I said above, we would simply invalidate everything that tracks the data affected by the mutation. After all, optimistic updates and updates in response to a mutation are essentially a hack, against the concept of read-only queries, which we use only due to technical limitations (traffic savings, connection availability and speed, and as a result, interface responsiveness).

So when we use normy, I would treat it like this:

This is the contract to mutations, which we know that they can update data synchronize cache state for all queries.

Because they do it automatically, we have no control over which queries are affected by the changes. In this case, it seems to me that the contract happens at the moment "createQueryNormalizer(queryClient)" is called and says all queries will be normalized and the data of the normalized objects will be synchronized anywhere. We can adjust the details of this contract in the configuration when we set the "nomalize" flag for all requests and refuse this contract when we add the "nomalize" flag to the meta - false, or vice versa. That is why it seems to me that fresh queries should update the data in existing queries.

the false security I mentioned above, unpredictable query data updates.

As a result, in my opinion, the contract should be that all data under the "normalize" flag do not conflict with each other and are synchronized. This contract doesn't say "all data is up to date", it's React Query responsibility.

I think it's just worth separating queries that sign normalization contracts and queries that exist within React Query contracts in terms of cache management.

it would hurt performance, each query can have big data, often much bigger than mutation, it could fetch many objects, and potentially update many other queries, even worse with react-query invalidation on focus, you could have like 10 queries updated, imagine they depend on each other, it would update each query 10 times even though all would be invalidated and fresh

Yes, this is indeed a question worth thinking about.

Thanks for the answer about Apollo. I see that we have similar thoughts about this. And given that you make all these great libraries for cache normalization, it seems to me that we have similar views on this problem, or at least we really see the same problem.

In any case, first of all, in this PR it is worth starting from the direction in which you see the development of the library. Since I realized that it is possible, this PR breaks her concept (what I wrote about the contract)

P. S. All of the above is IMHO

@pklsk
Copy link
Author

pklsk commented Jun 30, 2023

Perhaps a good compromise between automation and control would be to add something like a "propagate" property to the query and mutation properties or their meta, in which we specify the keys that should be affected. Like invalidation, but instead of re-request - cache update.
EDIT: something similar to invalidatesTags in rtk query

@klis87
Copy link
Owner

klis87 commented Jul 1, 2023

There is a lot of text, so if this discussion is boring for you, please say so.

Not at all, this is what make projects better!

This is exactly the point that has been causing me cognitive dissonance for a long time. Conceptually, this is true, and this is exactly why I think it is not worth implementing the functionality from this question #1.

For me it is kind of logical, because the mentioned issue about updating arrays is about doing this as a result of mutations, and this is the key. Mutation can update data, while query cannot. So I do not see any contradiction here. The reason for this task is, that instead of doing updates manually (boring) or invalidation of affected queries (boring, low performance and the worst - you need to really know which queries to invalidate, which is hard to maintain). Imagine a hard mutation like DELETE /book/1, which should make it removed from all queries. With hints like { id: '1', __removed: true }, I could just update all queries to remove this book from each list. There are more complex cases but I hope you can get the idea that this declarative way of updating data could be very convenient.

In fact, conceptually, when we mutate an object, the same thing happens as in the example with the creation of a new object. That is, conceptually the cleanest way would be to always just invalidate queries and never update the cache in response to mutations.

Removing performance from the equation, another big issue with invalidation is that you need to know which queries to invalidate. Even worse, what if you have some queries with server side paginations, filters and so on, you would need to know exactly which queries to invalidate, this is where normalization updates come into play, you do not need to know which queries are affected, you think only from object perspective.

something similar to invalidatesTags in rtk query

This is kind of similar problem I have with invalidation - you need to know which tag to invalidate, you need to manage grouping.


I hope I did not omit anything from your response, but to sum up, I do not believe that automatic query synchronization should land in the lib, for example due to performance reasons, but I do not want to prevent you from having queries synchronization feature either.

So, I would like to suggest you an alternative, which would actually do if I am not mistaken exactly what you need. What about a new method of queryNormalizer, let's call it setNormalizedData, or setLocalMutation, which would work like this:

queryNormalizer.setNormalizedData({ id: '1', title: 'Updated title' })

This would just do the same what would happen after response for normalized mutation - all queries with book with id: 1 would be recalculated and updated.

It would give max flexibilty, for example imagine you get { id: '1', title: 'Updated title' } from websocket server, now you can use this method to update all necessary queries.

Also, to allow you to sync queries, I would bring an option to blacklist queries, as it does not make sense to update the query which already was updated due to response:

useBooksQuery({
  onSuccess: (data) => {
    queryNormalizer.setNormalizedData(data, { omitQueries: ['books'] })
  }
})

Do you think it would be OK for use cases?

@klis87 klis87 mentioned this pull request Jul 2, 2023
@pklsk
Copy link
Author

pklsk commented Jul 5, 2023

I'm sorry that I disappeared for a long time, there was a lot of work.

I think the "setNormalizedData" method would be a nice extension to the queryNormalizer API in general. Something like this is definitely worth adding.

Within the scope of this PR, I see, as I expected, my proposal does not correspond to the concept of your library. So this can be closed.

Thanks for the discussion!

@klis87
Copy link
Owner

klis87 commented Jul 6, 2023

If you wish, the above feature could be implemented within this PR, as it hopefully addresses the topic, but in another way. Or, I could just create a new PR with a reference to yours. But the most important thing in this topic is, in your opinion setNormalizedData indeed would allow the cases we discussed?

@klis87
Copy link
Owner

klis87 commented Oct 29, 2023

@kalaskou I added the feature we discussed - https://github.com/klis87/normy/releases/tag/%40normy%2Freact-query%400.10.0

So I am closing this PR, feel free to ask any question is something is unclear or if this is not sufficient for you case.

@klis87
Copy link
Owner

klis87 commented Dec 22, 2023

Hi @pklsk

Please see #19 (comment) . There was a similar vote to yours. Because recently I added https://github.com/klis87/normy/releases/tag/%40normy%2Fcore%400.9.0 optimization, we could reconsider adding this feature, as with this optimization performance would be still good. What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants