Message hash based query #662

kaichaosun · 2024-01-26T04:46:10Z

By adding this fine-grained message query protocol, clients could check if messages are stored in specific nodes and further used by the sync protocol.

TODOs,

Do we need to wrap it in QueryMessagesRPC just like store and lightpush protocol? Metadata protocol is not using this wrap, so curious to know why the wrap exists.
Should we combine this protocol to store protocol with an upgrade? (Split it here, because I think it can be used without depend on store history query protocol).
Decide the max length of digests

chaitanyaprem · 2024-01-26T08:43:22Z

Few comments:

Why to have a separate rfc for this? I think it would be better if this is part of Store protocol itself, because this is just another type of query than the current HistoryQuery.
There are 2 use-cases which i can think of that would be useful. Hence i propose a new flag in the request which indicates whether to return the messagePayload or only indicate if message is present. Response can also be changed to indicate presence of a hash and/or payload.
2a. A node just wants to check if messageHashses are present or available with Store node, which i think is the reason why you had created this rfc
2b. A node wants to fetch specific messageHashses from a StoreNode because it knows it is missing those hases. This could be identified by some application level protocol or by StoreSync protocol between store nodes.

jm-clius

Thanks for delving into this problem domain. However, this functionality is being added by the Research team as part of a general overhaul of the Store protocol and will be an update on that RFC (https://rfc.vac.dev/spec/13/)

kaichaosun · 2024-01-26T10:32:26Z

Why to have a separate rfc for this?

this functionality is being added by the Research team as part of a general overhaul of the Store protocol and will be an update on that RFC (https://rfc.vac.dev/spec/13/)

@chaitanyaprem @jm-clius

To me store protocol is pretty unclear about its scope,
Things like how SQLite and Postgres is used in different scenarios is already worth to extract for its own spec.

sqlite works locally and strict about what to stored
postgres works on server side, tend to be federated topology, and share between communities/topics

Such practices is implicit in waku implementation. So I think maybe split spec for its single purpose could do better to waku protocol. Have more specs is definitely not the best solution, but with best practices learns, specs could merge at some point. In another words, current store protocol could be splited into at least 4 specs,

what to store, i.e. the schema of the store, and extensible/pluggable storage like relational database, kv database, graph database. (@chaitanyaprem also mentions use mongodb for messages storage, I think by extracting this part, it will help a lot for such requirements)
how to choose underlining storage provider for relational database, if go with sqlite, user expect it's on a client side and meets more restrictions, if go with postgres, user expect it can help with different topics/communities
history query wire protocol
message hash based query wire protocol

Hence i propose a new flag in the request which indicates whether to return the messagePayload or only indicate if message is present.

In p2p network, we should not trust the response from any 3rd party nodes before verify the integrity of the response. Verify the digest by recalculating the hash of the response is what we can do here.

This could be identified by some application level protocol or by StoreSync protocol between store nodes.

StoreSync is great, but this likely requires more iterations, I think this spec can help with the iterations.
Another thought is store protocol is currently a federated solution, I'd like to see if client could play this store role (with restricted resources) and remove the dependency to a high availability store nodes.

kaichaosun · 2024-01-26T10:57:43Z

Split store protocol could also help mitigate such compatibility issue: waku-org/go-waku#965.

chaitanyaprem · 2024-01-26T11:02:24Z

To me store protocol is pretty unclear about its scope, Things like how SQLite and Postgres is used in different scenarios is already worth to extract for its own spec.
* sqlite works locally and strict about what to stored

* postgres works on server side, tend to be federated topology, and share between communities/topics

If you read the first line in rfc it is pretty clear what is the scope of Store protocol.
the store protocol doesn't indicate what to store. The name is little misleading, it is more of a query protocol to query data from a node that Stores data. That is all what is specified in the rfc.
Since this is also like a query to a node a that is storing data, it makes more sense to include this in Store protocol itself as it falls under querying Stored messages.

chaitanyaprem · 2024-01-26T11:05:49Z

In p2p network, we should not trust the response from any 3rd party nodes before verify the integrity of the response. Verify the digest by recalculating the hash of the response is what we can do here.

I agree your point wrt integrity of data(which can just be locally calculated on requesting node from the data received to verify). But this is not related to the point what i was trying to highlight. I was just trying to indicate that this new query based on messageHashes can include 2 types of queries

A query where the requesting Node just wants to know if StoreNode has a specific hash or not
A query where requesting Node can fetch message by hash

kaichaosun · 2024-01-26T15:16:26Z

Thanks @chaitanyaprem for the detail info, it helps a lot.

Since this is also like a query to a node a that is storing data, it makes more sense to include this in Store protocol itself as it falls under querying Stored messages.

I'm not against to put history query and hash based query into same spec. Not doing that for now is, I don't know what's the best way to iterate spec/13. Making it split is the easy way I can think of to help with this feature.
And iterate this thin spec (either remove or merge into others) is pretty easy once the upgraded version spec/13 is clear.

More context, when looking at waku research milestone Epic SU1: Store v3-beta (message hashes), I think this is a good start point for me explorer the process with spec changes. By spliting this spec (spec/67) out of spec/13 is the strategy that I use to tackle the complexity within spec/13 upgrades and cross team collaboration.

A query where the requesting Node just wants to know if StoreNode has a specific hash or not

The response node returns a message whose hash result is same with the one requested, is probably the only way for request node to trust the response node has this message. If the response node is trusted like the message receiver, receiver node has a specific hash or not actually makes sense. Store node seems not in the same trust model.

ABresting · 2024-01-31T21:11:23Z

Not doing that for now is, I don't know what's the best way to iterate spec/13. Making it split is the easy way I can think of to help with this feature.

I echo with @chaitanyaprem on the no need for a split here. imo, Sync protocol completes Store protocol, since while Sync needs to figure out which hashes are missing --> Store needs to help deliver Waku messages corresponding to those hashes.

kaichaosun · 2024-02-05T04:27:44Z

Close in favor of #665

Message hash based query

a0e54c9

kaichaosun requested review from richard-ramos, plopezlpz, ABresting, chaitanyaprem and jm-clius January 26, 2024 04:47

max length of digests.

fc07e08

jm-clius reviewed Jan 26, 2024

View reviewed changes

kaichaosun closed this Feb 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Message hash based query #662

Message hash based query #662

kaichaosun commented Jan 26, 2024 •

edited

Loading

chaitanyaprem commented Jan 26, 2024 •

edited

Loading

jm-clius left a comment

kaichaosun commented Jan 26, 2024

kaichaosun commented Jan 26, 2024 •

edited

Loading

chaitanyaprem commented Jan 26, 2024

chaitanyaprem commented Jan 26, 2024

kaichaosun commented Jan 26, 2024 •

edited

Loading

ABresting commented Jan 31, 2024

kaichaosun commented Feb 5, 2024

Message hash based query #662

Message hash based query #662

Conversation

kaichaosun commented Jan 26, 2024 • edited Loading

chaitanyaprem commented Jan 26, 2024 • edited Loading

jm-clius left a comment

Choose a reason for hiding this comment

kaichaosun commented Jan 26, 2024

kaichaosun commented Jan 26, 2024 • edited Loading

chaitanyaprem commented Jan 26, 2024

chaitanyaprem commented Jan 26, 2024

kaichaosun commented Jan 26, 2024 • edited Loading

ABresting commented Jan 31, 2024

kaichaosun commented Feb 5, 2024

kaichaosun commented Jan 26, 2024 •

edited

Loading

chaitanyaprem commented Jan 26, 2024 •

edited

Loading

kaichaosun commented Jan 26, 2024 •

edited

Loading

kaichaosun commented Jan 26, 2024 •

edited

Loading