2023-03-integration-with-station.md

Zinnia in Filecoin Station

Status: ACCEPTED

Context

Zinnia is a runtime for Filecoin Station modules. It's time to integrate Zinnia into the Station to enable the deployment of the first modules.

Zinnia is designed to run multiple modules inside the same process, using V8 Isolates for keeping modules separated and sandboxed. The integration with libp2p (and later IPFS) is designed to allow multiple modules to share the same underlying set of network connections and block storage.

Zinnia has two primary modes of operation:

  1. A developer tool for building a single module. We want to optimize for ease of use and a fast feedback loop.

    In this mode, Zinnia is typically started and stopped frequently, running for short periods at a time, executing a single module only.

    As a CLI tool, it reads the configuration from a config file (either user-level or project-specific) or CLI arguments. There should be reasonable defaults tailored to developers building a module - e.g. the state files should be stored in the project's working directory.

  2. A deployment platform, running modules inside Filecoin Station instances operated by non-technical users. We want to optimize for a smooth user experience for Station operators first, and easy troubleshooting of operations by module developers second.

    In this mode, Zinnia is a long-running process executing multiple modules. It needs to support reloading individual modules when a new version of a module is deployed. It also needs to report various information back to the Station, e.g. activity log and the number of jobs completed.

    As a service, it should read the configuration from environment variables (see The Twelve-Factor App) and allow embedders to improve security e.g. by keeping modules in a different place from the runtime state.

zinniad

For the reasons above, I propose to build a new binary called zinniad that will execute Zinnia modules inside the Station.

(This is a strawman proposal, subject to changes during implementation.)

Configuration via environment variables:

  • FIL_WALLET_ADDRESS: Address of Station's built-in wallet (required).

  • ROOT_DIR: Directory in which to keep state files (optional). Defaults to a platform-specific location, e.g. $XDG_STATE_HOME/zinniad on Linux.

    Note: ROOT_DIR was superseded by CACHE_DIR and STATE_DIR, see Filesystem Conventions ADR.

Positional arguments:

  • Positional arguments specify which modules to run, where each module is a single JS file. We don't make any assumptions about the directory layout of modules. Paths are resolved relative to the current working directory.

Example invocation:

cd /Applications/Filecoin\ Station.app/Contents/Resources/zinnia-modules

FIL_WALLET_ADDRESS=f1etc \
ROOT_DIR=$HOME/Library/Caches/Filecoin\ Station/zinnia \
zinniad \
  saturn-l2/main.js \
  ping.js \
  retrieval-checker/dist/index.js
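
On the Station side, the same invocation could be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper `buildZinniadInvocation` and a hypothetical `ZinniadInvocation` shape; neither is part of Station or Zinnia.

```typescript
// Hypothetical helper; the names and shapes are illustrative only.
interface ZinniadInvocation {
  command: string;
  args: string[];
  env: Record<string, string>;
}

function buildZinniadInvocation(
  walletAddress: string,
  modules: string[],
  rootDir?: string,
): ZinniadInvocation {
  // FIL_WALLET_ADDRESS is the only required setting.
  if (walletAddress === "") {
    throw new Error("FIL_WALLET_ADDRESS is required");
  }
  const env: Record<string, string> = { FIL_WALLET_ADDRESS: walletAddress };
  // ROOT_DIR is optional; when omitted, zinniad uses its platform-specific default.
  if (rootDir !== undefined) {
    env.ROOT_DIR = rootDir;
  }
  // Each positional argument is the path of one module, relative to the working directory.
  return { command: "zinniad", args: [...modules], env };
}
```

The result maps onto Node.js `child_process.spawn(command, args, { env, cwd })`, where `cwd` would be the directory containing the modules. Note that `spawn` replaces the inherited environment when `env` is given, so a caller may want to merge these values into a copy of `process.env`.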

Communication with Station (Core, Desktop):

As explained above, I propose to use environment variables to pass configuration from the Station to Zinnia.

For communicating information from Zinnia back to the Station, I propose that Zinnia prints newline-delimited JSON entries to stdout and uses stderr for general/debug logging. This format is easy to parse from Node.js code powering both Station variants.

Example messages:

  • Activity log - error

    {"type": "activity:error", "module": "saturn", "message": "Cannot connect to the orchestrator."}

    Note: "module": "saturn" describes which module emitted the log.

  • Activity log - info

    {"type": "activity:info", "module": null, "message": "Zinnia is starting up..."}

    Note: "module": null means the message comes from Zinnia runtime.

  • Number of jobs completed:

    {"type": "jobs-completed", "total": 123 }

    Note: This message is emitted periodically, e.g. every 200ms.

    In the future, we can easily extend this line to include per-module stats too:

    {"type": "jobs-completed", "total": 123, "modules": {"saturn": 100, "retrieval-checker": 23}}

  • Messages logged via Console APIs like console.log are printed to stderr without any modifications.

    Pinging /dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa
    RTT: 1252ms
    Cannot record stats: Error: InfluxDB API error 401
    {"code":"unauthorized","message":"unauthorized access"}
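
To illustrate how the Station could consume the stdout protocol described above, here is a minimal sketch of a newline-delimited JSON parser. The names `StationMessage`, `StdoutLineParser` and `parseMessage` are hypothetical. Silently ignoring malformed lines and unknown message types is an assumption made for forward compatibility, not something this proposal prescribes.

```typescript
type StationMessage =
  | { type: "activity:info" | "activity:error"; module: string | null; message: string }
  | { type: "jobs-completed"; total: number };

/** Validate a single line; returns null for anything that is not a known message. */
function parseMessage(line: string): StationMessage | null {
  let value: unknown;
  try {
    value = JSON.parse(line);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== "object" || value === null) return null;
  const obj = value as Record<string, unknown>;
  const type = obj["type"];
  if (type === "activity:info" || type === "activity:error") {
    const message = obj["message"];
    const rawModule = obj["module"];
    if (typeof message !== "string") return null;
    // `null` means the message comes from the Zinnia runtime itself.
    let moduleName: string | null;
    if (rawModule === null) moduleName = null;
    else if (typeof rawModule === "string") moduleName = rawModule;
    else return null;
    return { type, module: moduleName, message };
  }
  if (type === "jobs-completed") {
    const total = obj["total"];
    return typeof total === "number" ? { type, total } : null;
  }
  return null; // unknown message type
}

/** Buffers stdout chunks, because a chunk may end in the middle of a line. */
class StdoutLineParser {
  private buffer = "";

  push(chunk: string): StationMessage[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line, or "" when the chunk ended with "\n".
    this.buffer = lines.pop() ?? "";
    const messages: StationMessage[] = [];
    for (const line of lines) {
      const message = parseMessage(line);
      if (message !== null) messages.push(message);
    }
    return messages;
  }
}
```

In Node.js, each `data` event of the child process's stdout stream would be passed to `push()`, while stderr would be forwarded to a log file unchanged.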
    

Module identifiers

We need each module to have a unique identifier (a name) that we can use in the messages above. This id must remain unchanged across module version upgrades.

For the initial version, these ids will be hard-coded human-readable names like saturn-l2.

In the future, when we move towards untrusted modules deployed in a decentralized manner, we will need to find a different way to derive these unique module ids. That's out of scope for the current work, though. Adding new id types should be easy as long as our architecture supports arbitrary string ids.

Deploying and upgrading modules

The initial version will not implement any upgrade mechanisms for modules. Both zinniad and all module sources will be bundled inside the Station. When a new module is added or an existing module is upgraded to a new version, we will publish a new version of the Station.

Zinnia API for module builders

(This is a strawman proposal, subject to changes during implementation.)

declare namespace Zinnia {
  // omitted: existing APIs like `peerId`

  /** Get the wallet address. This value is typically provided by the Station. */
  const walletAddress: string;

  /** Report activities to the Station. These messages are displayed in the main UI. */
  const activity: {
    /** Report an informative status update, e.g. "Connecting to the network." */
    info(message: string): void;

    /** Report an error, e.g. "Cannot connect to the orchestrator." */
    error(message: string): void;
  };

  /** Report completion of a single job. */
  function jobCompleted(): void;
}
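
To show how these API calls relate to the stdout messages described earlier, here is a minimal sketch of a stand-in for the runtime side. It is not the actual zinniad implementation: `ZinniaModuleApi`, `createModuleApi`, `writeLine` and `flushStats` are hypothetical names, and the sketch covers a single module only, whereas the proposed `total` would be aggregated across all running modules.

```typescript
// Illustrative stand-in; the real runtime is free to implement this differently.
interface ZinniaModuleApi {
  walletAddress: string;
  activity: {
    info(message: string): void;
    error(message: string): void;
  };
  jobCompleted(): void;
}

function createModuleApi(
  moduleName: string,
  walletAddress: string,
  writeLine: (line: string) => void,
): { api: ZinniaModuleApi; flushStats: () => void } {
  let total = 0;

  // Activity events are written immediately, tagged with the emitting module.
  const report = (type: string, message: string): void =>
    writeLine(JSON.stringify({ type, module: moduleName, message }));

  const api: ZinniaModuleApi = {
    walletAddress,
    activity: {
      info: (message) => report("activity:info", message),
      error: (message) => report("activity:error", message),
    },
    // Completions are only counted here; flushStats() emits the running total.
    jobCompleted: () => {
      total += 1;
    },
  };

  // Intended to be called on a timer, e.g. every 200ms.
  const flushStats = (): void =>
    writeLine(JSON.stringify({ type: "jobs-completed", total }));

  return { api, flushStats };
}
```

Counting completions and emitting the total on a timer keeps the volume of stdout traffic bounded even when a module completes many jobs per second.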

Dev-mode in zinnia

These APIs will behave differently when running a module via the zinnia CLI in development.

  • In the initial version, the wallet address is hardcoded to a dummy testnet address t1abjxfbp274xpdqcpuaykwkfb43omjotacm2p3za. This value is taken from Filecoin Lotus docs with the leading "f" replaced with "t".

    Later, we can implement reading of the wallet address from a configuration file, e.g. .zinnia/config.yaml in the current working directory (typically the project root).

  • Activity logs are printed to stdout with human-readable formatting.

    [10:30:20.000  INFO] Connecting to the network.
    [10:30:21.000 ERROR] Cannot connect to the orchestrator.
    
  • Job completions are printed to stdout but less frequently, e.g. every 500ms.

    [10:30:20.000 STATS] Jobs completed: 123
    [10:30:20.500 STATS] Jobs completed: 134
    [10:30:21.000 STATS] Jobs completed: 146
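
The human-readable format shown in the samples above could be produced by a small formatter. This is a minimal sketch, assuming local time and a level column right-aligned to five characters (inferred from the samples); `DevLogLevel` and `formatDevLogLine` are hypothetical names.

```typescript
type DevLogLevel = "INFO" | "ERROR" | "STATS";

function formatDevLogLine(time: Date, level: DevLogLevel, message: string): string {
  const pad = (value: number, width: number): string => String(value).padStart(width, "0");
  // HH:MM:SS.mmm in the local time zone (an assumption; the samples do not say).
  const timestamp =
    `${pad(time.getHours(), 2)}:${pad(time.getMinutes(), 2)}:` +
    `${pad(time.getSeconds(), 2)}.${pad(time.getMilliseconds(), 3)}`;
  // Right-align the level to five characters so that messages line up.
  return `[${timestamp} ${level.padStart(5, " ")}] ${message}`;
}
```

For example, a job-stats line would be built with `formatDevLogLine(new Date(), "STATS", "Jobs completed: " + total)`.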
    

Options Considered

  1. Don't build a new binary, bundle the existing zinnia CLI inside the Station.

    Pros:

    • Less work to ship the first version. Less yak-shaving like setting up CI/CD workflows.

    Cons:

    • Zinnia modules cannot share libp2p & IPFS resources (network connection, peer address book, block store).

    • To meet the requirements of both module builders and the Station runtime, we would need to implement extra configuration options to get different behaviour in different settings.

    • Most of the Station work will be discarded later, once we need zinniad to allow running multiple modules inside the same runtime process.

  2. Don't push job stats via stdout; let the Station pull the stats via an HTTP API. (The current saturn-l2 module uses this model.)

    Pros:

    • We already have code in Station to deal with this.
    • Less cluttered stdout

    Cons:

    • More complex implementation in Zinnia for little benefit. Since zinniad routes all console.log messages to stderr, no humans should be reading stdout, therefore extra clutter does not matter.

    • More complex integration between Station and Zinnia: Zinnia needs to report the URL where the stats API is available, and the Station needs to parse that URL from Zinnia's stdout.

  3. Include a timestamp in the JSON messages printed to stdout for the Station. We decided this is not needed now and can be easily added later if such a need arises.

    Our current goal is integration with the Station. The Station (Core or Desktop) and the Zinnia runtime will initially sit on the same machine. If the log consumer sits on the same machine, it shouldn't matter who will attach the timestamp.

  4. Use XDG_STATE_HOME to configure where zinniad should keep the state files.

    The major difference between XDG_STATE_HOME and ROOT_DIR is that XDG_STATE_HOME provides a system-wide (or user-wide) directory, so we need to append a zinnia-specific segment to that path to obtain ROOT_DIR.

    I prefer to give the user full control over the location by providing a ROOT_DIR config option.

    This becomes relevant when Zinnia is running inside the Station. If we use XDG_STATE_HOME, then we will keep the state in XDG_STATE_HOME/zinnia, a different place from where the Station keeps its files. We could make this path Station-specific, but that feels hacky to me and incorrect in the situation when Zinnia runs outside of the Station.

    However, I think it's a good idea to make ROOT_DIR an optional configuration option and use XDG_STATE_HOME/zinnia as the default value.

    A few more thoughts:

    • XDG_STATE_HOME is specific to Linux. Windows uses something like %LOCALAPPDATA%, macOS something else.

    • We already provide ROOT_DIR for trusted modules, quoting from module interface docs:

      ROOT_DIR The long-lived working directory on disk. The module must store all of its files inside (subdirectories of) this directory. The directory isn't expected to be backed up or shared across machines in any way.
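
The fallback described in option 4 can be sketched as follows. This is a minimal, Linux-only illustration: `resolveRootDir` is a hypothetical name, the application segment is passed in as a parameter rather than fixed, and the `$HOME/.local/state` fallback comes from the XDG Base Directory specification rather than from this proposal.

```typescript
function resolveRootDir(
  env: Record<string, string | undefined>,
  appSegment: string,
): string {
  // An explicit ROOT_DIR always wins, giving the embedder full control.
  if (env.ROOT_DIR) return env.ROOT_DIR;

  // Otherwise derive the default from XDG_STATE_HOME.
  if (env.XDG_STATE_HOME) return `${env.XDG_STATE_HOME}/${appSegment}`;

  // XDG spec: when XDG_STATE_HOME is unset or empty, use $HOME/.local/state.
  if (env.HOME) return `${env.HOME}/.local/state/${appSegment}`;

  throw new Error("Cannot determine the state directory: set ROOT_DIR");
}
```

Taking the environment as a parameter (instead of reading `process.env` directly) keeps the function easy to test. Windows and macOS would need their own platform-specific defaults.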

Decision

Build zinniad and add new Zinnia APIs as described above.

Consequences

After we implement the architecture described above and integrate Zinnia into the Station, we will have a solid foundation that's easily extensible and should not require major updates for a long time.