Feature 1289 Data Caching Mechanism
When running at the edge, network connection failures are common. For rules that sink to an external system, especially a remote one, it is important to cache data during failures such as network disconnection and to resend it once the connection is restored.
Previously, eKuiper had some support for sink caching. It provides a global configuration to switch the cache on or off, and a system/rule level configuration for the memory cache serialization interval. However, the cache lives only in memory, with a copy in the db (a mirror of the memory), and no clear resend strategy is defined. It is actually part of the failure recovery mechanism (QoS). In this proposal, the cache will be saved in both memory and disk so that the cache capacity becomes much bigger; the sink will also continuously detect failure recovery and resend without restarting the rule.
Caching happens only in the sink because that is the only place where data is sent outside eKuiper. Each sink can have its own cache mechanism configured. The flow in each sink is alike:
- Error detection: after a send fails, the sink should identify a recoverable failure (network and so on) by returning a specific error type, which triggers caching.
- Cache mechanism: the memory cache stores the earliest failed data. After the memory threshold is exceeded, newly cached data is stored on disk. After the disk threshold is exceeded, the memory cache, which holds the earliest events, is dropped and the earliest page of the disk cache is read into memory.
- Resend strategy: when new data comes in (how to trigger is still to be discussed), send the first item from the memory cache to probe the network condition. If it succeeds, send all cached data in order (memory, then disk) at a defined interval.
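The three steps above can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not eKuiper's implementation: `ErrRecoverable`, `SinkCache`, and its methods are hypothetical names, the disk tier is simulated with a slice instead of a real store, and both sizes are assumed to be positive.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrRecoverable is a hypothetical marker error for failures worth caching,
// such as a network disconnection. The name is an assumption.
var ErrRecoverable = errors.New("recoverable sink error")

// SinkCache is a hypothetical two-tier cache. The disk tier is a plain
// slice here; a real sink would persist it.
type SinkCache struct {
	MemorySize, DiskSize int // maximum message counts, assumed > 0
	mem, disk            []string
	Dropped              int // messages discarded on disk overflow
}

// loadPage moves the earliest disk page (up to MemorySize items) into memory.
func (c *SinkCache) loadPage() {
	n := c.MemorySize
	if n > len(c.disk) {
		n = len(c.disk)
	}
	c.mem = append([]string{}, c.disk[:n]...)
	c.disk = c.disk[n:]
}

// Add caches one failed message: memory first, then disk. When the disk
// tier is full, the memory page (the earliest events) is dropped and the
// earliest disk page replaces it.
func (c *SinkCache) Add(msg string) {
	if len(c.disk) == 0 && len(c.mem) < c.MemorySize {
		c.mem = append(c.mem, msg)
		return
	}
	if len(c.disk) >= c.DiskSize {
		c.Dropped += len(c.mem)
		c.loadPage()
	}
	c.disk = append(c.disk, msg)
}

// Len reports how many messages are currently cached.
func (c *SinkCache) Len() int { return len(c.mem) + len(c.disk) }

// Resend sends cached messages oldest first, pausing for interval between
// sends. The first send doubles as the probe: on any error it stops and
// leaves the remaining messages cached.
func (c *SinkCache) Resend(send func(string) error, interval time.Duration) (int, error) {
	sent := 0
	for c.Len() > 0 {
		if len(c.mem) == 0 {
			c.loadPage()
		}
		if err := send(c.mem[0]); err != nil {
			return sent, err
		}
		c.mem = c.mem[1:]
		sent++
		if c.Len() > 0 {
			time.Sleep(interval)
		}
	}
	return sent, nil
}

func main() {
	c := &SinkCache{MemorySize: 2, DiskSize: 3}
	for i := 1; i <= 6; i++ {
		c.Add(fmt.Sprintf("m%d", i)) // m1 and m2 are dropped when m6 arrives
	}
	down := func(string) error { return ErrRecoverable }
	_, err := c.Resend(down, 0)
	fmt.Println(errors.Is(err, ErrRecoverable), c.Len(), c.Dropped)

	up := func(m string) error { fmt.Println("sent", m); return nil }
	n, _ := c.Resend(up, 10*time.Millisecond)
	fmt.Println(n, c.Len())
}
```

In this sketch a failed probe leaves the cache untouched, so the next trigger simply retries from the same oldest message. Which event triggers `Resend` is deliberately left to the caller, since the proposal leaves that point open.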
There will be two levels of sink cache configuration: a global configuration in etc/kuiper.yaml that defines the default behavior for all rules, and a rule sink level definition that overrides the defaults.
The configuration items:
- enableCache: whether to enable sink cache.
- memorySize: the maximum number of messages to be cached in memory. It can be considered as a "page".
- diskSize: the maximum number of messages to be cached in the disk.
- resendInterval: the interval between resent messages after recovery from a failure.
- The disk storage will be SQLite. Each sink will have its own SQLite table to save the cache.
- Add the cached count to the sink metrics.
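The two-level configuration (a global default plus a rule sink level override) could be resolved along the following lines. This is a hedged sketch: the field names mirror the configuration items listed above, but the Go types, the `SinkCacheOverride` and `Resolve` names, the millisecond unit for `ResendInterval`, and the sample values are all assumptions made for illustration.

```go
package main

import "fmt"

// SinkCacheConf holds the four configuration items. The Go types and the
// millisecond unit are assumptions; the proposal does not specify them.
type SinkCacheConf struct {
	EnableCache    bool
	MemorySize     int // max messages cached in memory (one "page")
	DiskSize       int // max messages cached on disk
	ResendInterval int // assumed to be milliseconds
}

// SinkCacheOverride is a hypothetical rule sink level overlay.
// A nil field means "inherit the global default".
type SinkCacheOverride struct {
	EnableCache    *bool
	MemorySize     *int
	DiskSize       *int
	ResendInterval *int
}

// Resolve applies the rule-level override on top of the global defaults.
func Resolve(global SinkCacheConf, o SinkCacheOverride) SinkCacheConf {
	out := global
	if o.EnableCache != nil {
		out.EnableCache = *o.EnableCache
	}
	if o.MemorySize != nil {
		out.MemorySize = *o.MemorySize
	}
	if o.DiskSize != nil {
		out.DiskSize = *o.DiskSize
	}
	if o.ResendInterval != nil {
		out.ResendInterval = *o.ResendInterval
	}
	return out
}

func main() {
	// Illustrative values only, not eKuiper's shipped defaults.
	global := SinkCacheConf{EnableCache: false, MemorySize: 1024, DiskSize: 1024000, ResendInterval: 0}

	// One sink turns caching on and shrinks its memory page.
	on, mem := true, 100
	effective := Resolve(global, SinkCacheOverride{EnableCache: &on, MemorySize: &mem})
	fmt.Printf("%+v\n", effective)
}
```

Pointer fields are used in the override so that an explicit `false` or `0` at the rule level can be told apart from "not set", which a plain value field could not express.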