Stateless Aerie - Prototyping & System Design #1506

dandelany · 2024-07-18T00:05:20Z

Description

Per discussions with @ewferg, some users would like an officially-supported way to use Aerie's simulation and scheduling engines as a simpler "stateless" and "headless" program.

Stateless means the user runs Aerie simulation/scheduling with some inputs, then Aerie returns some output and exits, with nothing persisted in the Aerie app/DB beyond that point. This is in contrast to our current standard Aerie deployment, which is a persistent web service.
Headless means without a UI, i.e. likely a CLI which reads from stdin and writes results to stdout when complete.

Requirements

We don't have all of the requirements or implementation fully planned out yet, but I wanted to open this to track our progress. I will meet with @ewferg soon to discuss - please comment here or in Slack if you're interested in participating. As a starting point, here's a rough draft of some requirements:

Users should be able to run Aerie's simulation engine alone in a stateless way, by running a command and providing two inputs: a mission model JAR file, and an Aerie plan file containing activity directives. Aerie should run the simulation, output any useful results (grounded activity instances, profiles, other TBD?) and exit.
Users should also be able to run scheduling alone statelessly, by running a command and providing a mission model, an Aerie plan, and scheduling goals as inputs. Aerie should execute the scheduling goals (possibly running more simulations in the process), output useful results, and exit.
For a reasonably small model/plan/set of goals (to be defined), stateless simulation should be able to run in < 10 seconds, and stateless scheduling should run in < 30 seconds, to be competitive with existing tools that users are using today instead of Aerie.

Notes & Questions

Our first goal is to determine how feasible this is given our current architecture, & the level of effort it will take. Once the requirements are further fleshed out, we'll meet with devs to come up with a proposed implementation.
Can stateless sim/scheduling be done without running our (hasura/postgres) database service, since nothing is being persisted, or is the DB too tightly coupled to Aerie sim/scheduling for this to work? The performance requirements will be difficult to meet if we have to wait for Hasura's significant "spin-up" time.
Stateless scheduling will be required to support procedural scheduling JARs. Does it also need to support EDSL goals/interleaving like the UI does?
Is there any overlap here with aerie-cli functionality, and should we consider taking ownership of aerie-cli and implementing this feature there, or is this a part of Aerie core, or a new tool altogether that interfaces with Aerie core?


ewferg · 2024-07-18T00:57:06Z

A couple additional notes and questions to consider:

What should the input/output format look like for sim results? Should it be a separate file or additional data in the file that contains the plan?
Should constraint checking also be supported in addition to simulation and scheduling? If so what would the output for this information look like? I think in order to support this, we would need procedural constraints.

joshhaug · 2024-07-18T01:02:27Z

Seconding this use case. A headless Aerie would make it a much more viable drop-in replacement in legacy ops planning toolchains. I’m particularly thinking of SMAP, but there are likely several other missions that could use such a feature.

Exposing some standard driver interface that lets a user run procedural scheduling, perform a simulation, grab the decomposed activities’ computed attributes, export resource timelines, etc. would be very useful.

Some of this wrapper code has definitely been implemented on a per-adaptation level, but having it as part of the core would be stellar.

dandelany · 2024-07-23T00:09:35Z

Met with Aerie developers today to start sketching out implementation details for this... Some conclusions from this:

We think it is feasible to build a stateless Aerie; it will just require some refactoring of code to support the different use cases
It should be possible to use Aerie simulation and scheduling without any Postgres database or Hasura service
Using Docker vs. not?
- Docker is easier to distribute/has fewer dependencies but has a performance penalty, especially initial startup time
- Therefore we want to distribute headless Aerie as a JAR file, maybe with a wrapper script that makes it easier to call
- It can be a "fat JAR" that includes all libraries, so the only system dependency should be Java
Supporting Scheduling/Constraints/Command Expansion with EDSLs?
- would require node as an additional system dependency
- Expect to not support any of these for now
- But we should support procedural scheduling, passing [paths to] procedure JARs into headless Aerie
- and possibly Java procedural constraints in the future, if we add them to Aerie
Inputs to headless Aerie:
- For simulation: (paths to) Mission model JAR, plan.json file, sim config file
- For scheduling: same as sim, + scheduling procedures (JAR files) + parameters for procedures
  - Could specify procedures + their parameters in a single JSON file instead of a bunch of flags
- How to use other types of files as inputs?
  - Responsibility of mission model - it can fetch other files from local filesystem if needed
- Any need to (UNIX) pipe inputs into aerie? No, doesn't really make sense
What are the outputs?
- For simulation: Just simulation results
- Don't have a canonical JSON format for sim results - we should define one as a part of this task
What form do outputs take?
- i.e. write to files or stdout? Streaming output vs. write once at the end?
- Should strive to have the tool output everything on stdout unless benefits of doing otherwise are large, for sake of modularity
- JSON doesn't lend itself well to streaming output: we can't really stream JSON to stdout during simulation, or during scheduling, which will be interleaving different kinds of results
- Naive version of "write at the end" approach requires having all results in memory, which doesn't scale well. But could be good enough for MVP?
- Proposal: define an internally-used format for sim/scheduling results that is not JSON, instead something that is easily streamable. Stream this to a temporary file during sim, then at the end use these files + a JSON string-builder to put final JSON output on stdout in a performant way.
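A minimal sketch of the "stream to a temp file, assemble JSON at the end" proposal above, written in Python purely for illustration (the real tool is Java). The function name `stream_results_to_json` and the `"records"` key are hypothetical, and the per-line encoding is an assumption: each record is spooled as one line, so the spool file is an append-only stream rather than a single JSON document.

```python
import io
import json
import sys
import tempfile


def stream_results_to_json(records, out=sys.stdout):
    """Spool records to a temp file one per line, then emit one JSON document.

    Only one record is held in memory at a time, in both phases.
    """
    with tempfile.TemporaryFile(mode="w+", encoding="utf-8") as spool:
        # Phase 1 (during the run): append each record as soon as it is produced.
        # json.dumps escapes newlines, so one record always fits on one line.
        for record in records:
            spool.write(json.dumps(record) + "\n")

        # Phase 2 (at the end): replay the spool and build the final JSON
        # by concatenating the already-encoded records.
        spool.seek(0)
        out.write('{"records": [')
        for i, line in enumerate(spool):
            if i:
                out.write(",")
            out.write(line.rstrip("\n"))
        out.write("]}")
```

Passing a generator as `records` keeps the producer side streaming as well; the trade-off is one extra pass over the temp file at the end.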

ewferg · 2024-07-24T03:17:20Z

I just want to throw one more thing out there that I think is reasonable to include in our trade space. While we could implement a standalone command line stateless Aerie, it may be sufficient to provide a template that users could use to build their own command line utility. This template would include a basic main() function, helper functions to orchestrate simulation and scheduling (and potentially constraint checking in the future), and helper functions for reading/writing data. This would require users to use an IDE to build their own stateless Aerie, but give them more control to design their stateless Aerie to meet their specific needs.

For example, in the SMAP use case, plan and simulation data output is actually not what's important to them. Instead it's sequences embedded in contributed attributes within activities. In this example, SMAP could design their stateless Aerie to simply write out the sequences, which would require some mission-specific code, instead of having to write a separate script to do that work. Admittedly another way to do this would be to write a scheduling procedure that writes out the sequences and run that procedure last, which could be done with a generic stateless Aerie utility that provided an orchestration script. This, in fact, is the way that APGen worked via command line.

Another question I have is whether a user could bundle model and procedure jars together so as not to have to input a bunch of procedure jars separately into a command line utility. It's not uncommon to have 10s (maybe even 100s) of scheduling procedures and dealing with all of those separately would be painful.

Mythicaeda · 2024-07-30T20:48:09Z

Headless Aerie is now demoable for simulation, complete with JSON Sim Results.
Open tasks:

Catching and returning well-formatted exceptions (current behavior is that a sim exception is completely uncaught, meaning we aren’t getting info like the throwing directive id/stack traces out)
Allowing users to cancel running simulations (machinery is hooked up, we just need to write the listening code)
Some tweaks to how we write results (current behavior has them built as a large JSON object in memory, but the code was written with the expectation that we’d update that to be writing straight to stdout instead)
Change the resource manager to send resources out to temp files instead of using InMemory
Minor, but I want to think about tweaking the SimExtentConsumer so that it doesn't print every single time the engine steps forward, but only every few updates, or prints the value every few seconds (this would make it match the behavior of the DB variant)
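For the last item, a time-based throttle is one way to do it. The sketch below is illustrative Python, not the actual SimExtentConsumer (which is Java); the class name `ThrottledExtentReporter`, its parameters, and the 2-second default are all assumptions.

```python
import time


class ThrottledExtentReporter:
    """Forwards a progress value to `sink` at most once per `min_interval_s`."""

    def __init__(self, min_interval_s=2.0, clock=time.monotonic, sink=print):
        self.min_interval_s = min_interval_s
        self.clock = clock          # injectable so the throttle can be tested
        self.sink = sink            # where accepted values go (stdout by default)
        self._last_emit = None      # time of the most recent forwarded value

    def accept(self, extent):
        """Called on every engine step; drops values that arrive too soon."""
        now = self.clock()
        if self._last_emit is None or now - self._last_emit >= self.min_interval_s:
            self._last_emit = now
            self.sink(extent)
```

The first value is always reported, so short simulations still produce at least one progress line.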

Mythicaeda · 2024-08-01T23:26:16Z

Discussed briefly with @JoelCourtney what ought to be expected from scheduling. We agreed that an updated plan.json makes sense as the default output, with an additional flag to optionally write the final sim results to a file. Scheduling goal satisfaction did not come up.

mattdailis · 2024-08-02T18:43:17Z

Suggestion from demos: Consider adding a validate subcommand that runs validation on each activity in the plan.json

dandelany · 2024-09-12T00:03:19Z

To recap some past progress since the last update:

A beta/early-release version of Stateless Aerie is merged & included in the Aerie v2.19.0 release (thanks @Mythicaeda!)
Not yet documented; we'll work on docs for a later production-ready release.
So far Stateless Aerie includes support for running simulation only, no scheduling yet.
In addition to the CLI, this release also includes support for running user-provided orchestration scripts, as outlined in @ewferg's comment above
- This is done via a new Java package called orchestration-utils which allows users to import functions/classes they can use to control simulation via their own Java code instead of the CLI
Some shared types were moved into a new package called type-utils, and we will likely move more types there in the future.

We met today to talk about next steps for supporting running procedural scheduling goals using Stateless Aerie. Meeting notes:

We plan to allow users to run a list (specification) of scheduling goals in Stateless Aerie
Only procedural goals will be supported, not eDSL goals, at least for now.
- Much more work to support eDSL & we'd like Stateless Aerie to be Java only; eDSL requires Node/JS to compile
Users will be able to run procedural scheduling goals via the CLI or with their own code, using new functions we'll add to the orchestration-utils package
Proposed inputs and outputs for calling scheduler from the Stateless CLI:
- Inputs:
  - model JAR
  - Plan JSON
  - goal specification JSON: goal list/parameters
    - e.g. [{goaljar: file.jar, order: 1, params: {}, simulateAfter: bool}]
  - optional: SimResultsJSON? (staleness concern)
  - optional: sim config json
  - optional: number of engines setting (defaults to same as in docker)
  - optional: Verbosity flag?
  - optional: forceSimAtEnd flag
    - ensures sim runs after last goal, does nothing if simulateAfter is true for last goal in spec
- Outputs:
  - Plan JSON (modified by scheduling goals)
  - Optional: scheduling goal satisfaction
  - Optional: simresultsJSON
    - probably don't call sim again to get this (add an input flag)?
    - corresponding filename input/return
- (inputs/outputs for the orchestration-utils function(s) will be similar, just slightly different formats)
In addition to providing procedural goals as JAR files, we'd like to also allow users to provide them as Java classes, which could make things easier & require fewer compilation steps.
- This feature will likely not be included in our next release (unless easier than expected), but may be added in a follow-up task.
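To make the proposed goal specification and the forceSimAtEnd rule concrete, here is a small illustrative sketch in Python (the real implementation would be Java). The field names goaljar, order, params, and simulateAfter come from the notes above; the function `plan_scheduling_steps` and the step tuples it returns are hypothetical, and treating a missing simulateAfter as false is an assumption.

```python
import json


def plan_scheduling_steps(goal_spec_json, force_sim_at_end=False):
    """Turn a goal specification into an ordered list of (action, jar) steps."""
    # Goals run in ascending `order`, regardless of their position in the file.
    goals = sorted(json.loads(goal_spec_json), key=lambda g: g["order"])

    steps = []
    for goal in goals:
        steps.append(("run_goal", goal["goaljar"]))
        if goal.get("simulateAfter", False):
            steps.append(("simulate", None))

    # forceSimAtEnd adds a final simulation only if the last goal
    # did not already request one via simulateAfter.
    if force_sim_at_end and goals and not goals[-1].get("simulateAfter", False):
        steps.append(("simulate", None))
    return steps
```

For example, a spec whose last goal has simulateAfter set to false yields a trailing simulate step only when `force_sim_at_end` is true.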

dandelany · 2024-09-17T00:04:22Z

One small issue we noticed with Stateless Aerie during testing with @Mythicaeda today: When you run a simulation, the events portion of the SimulationResults object that gets returned is supposed to look like:

{ ..., "events": [ EVENTS HERE ] }

Instead, it currently returns this wrapped format:

{ ..., "events": { "event": [ EVENTS HERE ] } }
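Until this is fixed, a consumer could normalize the output on its side. A minimal Python sketch, assuming the results have already been parsed into a dict; the helper name `normalize_events` is hypothetical and not part of Aerie.

```python
def normalize_events(sim_results):
    """Return results with "events" as a plain list, unwrapping {"event": [...]}."""
    events = sim_results.get("events")
    if isinstance(events, dict) and isinstance(events.get("event"), list):
        # Wrapped format: lift the inner list up one level.
        return {**sim_results, "events": events["event"]}
    # Already in the expected format (or no events key): leave untouched.
    return sim_results
```

The function returns a new dict rather than mutating its input, and it passes through results that are already in the expected shape, so it stays harmless once the bug is fixed.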

dandelany added the feature, design, and next labels Jul 18, 2024

dandelany self-assigned this Jul 18, 2024

dandelany assigned dandelany, Mythicaeda and goetzrrGit and unassigned dandelany Jul 18, 2024

dandelany removed the next label Aug 13, 2024

Mythicaeda mentioned this issue Aug 20, 2024

Stateless Aerie - Simulation #1536

Merged
