pi revisions

cardenaso11 · cardenaso11 · commit 4389fbbf678c · 2023-11-08T03:12:08.000-04:00
diff --git a/docs/adr/2023-10-25_025-streaming-persistence.md b/docs/adr/2023-10-25_025-streaming-persistence.md
@@ -1,7 +1,7 @@
 ---
 slug: 29
 title: |
-  29. Stateless API for consuming persistence events, and ingesting transactions into persistence
+  29. EventServer abstraction for event persistence
 authors: []
 tags: []
 ---
@@ -12,6 +12,18 @@ N/A
 
 ## Context
 
+The current Hydra API is materialized as a firehose WebSocket connection: upon connecting, a client receives a deluge of information, such as all past transactions. Additionally, there is no way to pick and choose which messages to receive, and not all events are exposed via the API.
+
+This forces some awkward interactions when integrating with a Hydra node externally, such as when using it as a component in other protocols, or trying to operationalize the Hydra server for monitoring and persistence. Here are just a few of the things that prove to be awkward:
+ - An application would need to store its own notion of progress through the event log, so that it knows which messages to "ignore" on reconnection
+ - An application that wanted to write these Hydra events to a message broker like RabbitMQ or Kinesis would need its own custom integration layer
+ - An application that only cared about a limited set of message types may be overwhelmed (and slowed down) by the barrage of messages on connection
+ - An application that wanted a deeper view into the workings of the Hydra protocol would have no access to several of the internal event types
+
+Additionally, many of the changes needed to solve any of these would apply equally well to the on-disk persistence the Hydra node currently provides.
+
+-------------------------------------
+
 - Current client API is a stateful WebSocket connection. Client API has its own incrementally persisted state that must be maintained, and includes a monotonically increasing `Seq` number, but this is based off the events dealing with the client, not the host hydra-node.
 
 - Client must be flexible and ready to handle many different events
@@ -32,26 +44,22 @@ N/A
 
 # Decision
 
-- Persistence event log will have a local observed event order written into each event, as a monotonic integer (global event ID). This should be made explicitly different from any type of `Seq` monotonic value. It will always be delivered in stateless client mode, to resume connections. We can assume Event Server will come up with a value for this, either obtained from message queue like Kinesis sequence number, or in Direct chain mode, just incrementing a counter
-
-- Translation component will be called Event Server, as a generalization of Chain Server above
-- EventServer should receive StateChanged Tx (and resulting integrated HeadState value for fast resume). CBOR/JSON encoded
-- EventServe should submit Event values, especially ClientEvent, as mentioned below, to represent new layer 2 transactions submitted. CBOR/JSON encoded
-- To start off with, we can have an Event Server that just does all the old persistence refactored into it, but with any new changes we need. That way we don't need to change everything at once, we can focus on Ledger state in Event type (ClientEvent, OnChainEvent Observation constructors)
+A new abstraction, the EventServer, will be introduced. Each internal Hydra event will be sent to the event server, which will be responsible for persisting that event and returning a durable, monotonically increasing global event ID. Different implementations of the event server can handle an event differently. Initially, we will implement the following:
+ - MockEventServer, which increments a counter for the ID, and discards the event entirely
+ - FileEventServer, which increments a counter for the ID, and encapsulates the existing file persistence mechanism
+ - SocketEventServer, which increments a counter for the ID, and writes the event to an arbitrary Unix socket
+ - WebsocketBroadcastEventServer, which broadcasts the publicly visible events over the WebSocket API
+ - MultiplexingEventServer, which has a primary event server (which it writes to first, and whose ID it returns) and a list of secondary event servers (which it writes to in sequence, discarding their IDs)
 
-<!-- This is sorta close enough to the current client API that we're already using, that I'm not sure if it's worth adding this, just to simplify offline mode lifecycle
-* Full duplex connection to a GummiWorm Event Server component, can be upgraded to message broker Event Server if we deem it worth it
-    * Only event types relevant to Gummiworm is transaction recieved / approved as valid / rejected & UTxO state, so these are all that will be delivered
-    * Client may not change any parameters of the subscription, just resume with a new connection
-    * No guarantee of acking in case a connection dies, but this is not an issue as the client can simply reinitiate a connection from the last point it remembered, and stream new events, and resubmit anything
-    * Global event order of events the stateless client sees will be always increasing, but may not increase by the same amount (monotonic, not monotonically increasing). Seq count will do the latter, but will only be consistent within a single session
-    * Transactions can be dropped off as CBOR, if configured in the query string.
-    * Very simple protocol: client sends to Event Server indefinite encoded stream of CBOR transactions, optionally with object wrapping them to support heart beats or any other necessities. Event Server sends back simple CBOR/JSON object with UTxO snapshot, global event ID, tx validation response -->
+New configuration options will be introduced for selecting and configuring the event server implementation. The default will be a multiplexing event server, using the file event server as its primary and a websocket broadcast event server as its secondary.
 
 ## Consequences
-- Interface to further modularize Hydra and standardize all the ways it interacts with the world
-- Ledger integrated with main persistence event stream of StateChanged, so we can progressively hook up more and more of the persistence event stream
 
-    - Incidentally already where we're hooking stuff for UTxO writeback, see createPersistenceWithUTxOWriteBack in offline mode PR
-    * In offline mode PR: we can replace createPersistenceWithUTxOWriteBack with a Chain Server connecting to a message queue. Event server provides us, later down the line, a nice place to hook into and use kinesis/kafka, with something like log compaction/rollup to store the full HeadState for a given event history (like before event sourcing change, see loadState still in main branch), which would give us fast restarts without adding another file to persistence directory. Not as in recreating the state, just including the full NodeState in the Websocket message to Event Server
-    * Much more short term it allows us to not have to subscribe to full Hydra client API and instead write a Chain Server to handle transactions in GummiWorm
+The primary consequence of this change is deeper integration with, and better operationalization of, the Hydra node. For example:
+- Users may now use the SocketEventServer to implement custom integrations with existing ecosystem tools
+- Users who want to avoid the overhead of a Unix socket may submit pull requests that add direct integrations with the most popular tools
+- Developers may more easily maintain downstream forks with custom implementations that aren't appropriate for community-wide adoption
+- Developers can get exposure to events that aren't normally surfaced in the websocket API
+- Logging, metrics, and durability can be improved or tailored to the application through such integrations
+
+Note that while a future goal of this work is to improve the websocket API, making it more stateless and "subscription" based, this ADR does not seek to make those changes, only to make them easier to implement in the future.
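
Reviewer note: the Decision section describes the EventServer as something that accepts an event and returns a global event ID, with a multiplexing variant that writes to a primary first and then to each secondary. The following is a minimal sketch of that shape, not the Hydra implementation. It is written in Python purely for illustration; the method name `put_event`, the `InMemoryEventServer` stand-in for a durable backend, the choice to start IDs at 0, and the placeholder event strings are all assumptions that do not appear in the ADR.

```python
from typing import Any, List, Protocol, Tuple


class EventServer(Protocol):
    """Hypothetical interface: persist one event, return its global event ID."""

    def put_event(self, event: Any) -> int: ...


class MockEventServer:
    """Increments a counter for the ID and discards the event entirely."""

    def __init__(self) -> None:
        self._next_id = 0  # assumption: IDs start at 0

    def put_event(self, event: Any) -> int:
        event_id = self._next_id
        self._next_id += 1
        return event_id


class InMemoryEventServer:
    """Illustrative stand-in for a durable backend such as the file event server.

    It keeps (id, event) pairs in a list and is NOT durable.
    """

    def __init__(self) -> None:
        self.log: List[Tuple[int, Any]] = []

    def put_event(self, event: Any) -> int:
        event_id = len(self.log)
        self.log.append((event_id, event))
        return event_id


class MultiplexingEventServer:
    """Writes to the primary first and returns its ID, then writes to each
    secondary in sequence and discards the IDs they return."""

    def __init__(self, primary: EventServer, secondaries: List[EventServer]) -> None:
        self.primary = primary
        self.secondaries = secondaries

    def put_event(self, event: Any) -> int:
        event_id = self.primary.put_event(event)
        for secondary in self.secondaries:
            secondary.put_event(event)  # secondary ID is intentionally ignored
        return event_id


# Usage: one primary, two secondaries, three placeholder events.
primary = InMemoryEventServer()
secondary = InMemoryEventServer()
server = MultiplexingEventServer(primary, [secondary, MockEventServer()])
ids = [server.put_event(e) for e in ("event-a", "event-b", "event-c")]
```

In this sketch only the primary's ID reaches the caller, which mirrors the ADR's description of the MultiplexingEventServer. Error handling (for example, what happens when a secondary fails after the primary has succeeded) is left out because the ADR does not specify it.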