We are happy to announce the Swarm 2.2.0 community release.
In this release, we have delivered key UI/UX enhancements, including experiment tracking for an easier bird's-eye view of past training rounds, parallel Swarm installation on multiple hosts, and Podman support via SLM-UI. We have also added features to the Swarm manageability framework for better management of user ML workloads.
Customers can download the product bits and documentation from My HPE Software Center.
Features
• Targeted SWOP command to run a task on a specific SWOP node.
-Dynamic addition of peers to an ongoing task execution.
-Retrying a failed task on a SWOP node.
• WITH ALL PEERS command to trigger a task execution on all available peers.
• UI/UX Enhancements
-Experiment tracking support to display the training attributes for multiple training rounds.
-Parallel Swarm installation - Option to add multiple hosts simultaneously.
-View SWOP profile and task YAML.
-Podman support.
• Swarm support for SPIRE as certificate manager.
-Added CLI based SPIRE example (spire/cifar10).
• Real-world NIH example – Added a new example to showcase a Swarm use case with a real-world NIH dataset.
• Documentation enhancements
Defect fixes
• Stale SL Admin node stuck waiting for quorum while a new Admin is selected.
• Enabled non-default APLS port support from SLM-UI.
• Issues during restart of the SLM-UI container while a training is running.
You can see the updated documentation for all new features and defect fixes here.
For help or clarifications, reach out on Slack: https://hpe-external.slack.com/archives/C02PWRJPWVD