This repository is forked from CMS Containerised Conda environments
which is an approach to deploying and maintaining large conda environments
while reducing inode usage and increasing performance. It takes advantage of
singularity's ability to manage overlay and SquashFS filesystems.
Each conda environment is consolidated into its own SquashFS file, and then
one or more of these SquashFS environments are loaded using components
of the environment. For more information on the CMS containerised environments,
see the Conda hh5 environment setup page on the CMS Wiki.
The environments in this repository are deployed to NCI on Gadi and can be activated using Environment Modules. To use the released modules, ensure you are a member of the vk83 project. If not, see how to join an NCI project.
There are currently two environments configured in this repository:
`payu`: This is a released payu environment with a version matching a tagged version of the payu-org/payu repository. To load this module, run:
module use /g/data/vk83/modules
module load payu # To load the latest tagged version
# To load a specific version of payu, instead run: module load payu/VERSION, e.g. module load payu/1.1.6
payu --help # Payu commands can now be run
`payu-dev`: This is the development payu environment that has the latest changes on the payu-org/payu repository's `master` branch. To load this module, run:
module use /g/data/vk83/prerelease/modules # Note the prerelease directory
module load payu/dev # To load the latest built version
If you encounter any problems running the above environments, please submit an issue here.
The sections below contain instructions for developers on updating the conda environments and some general notes on how the scripts and workflows interact.
Each conda environment has a subdirectory in `environments/` with the same name as the environment. Each subdirectory should contain the following files:
- `environment.yml`: The conda environment file.
- `config.sh`: This script sets up environment variables required for the build, test, and deploy scripts. It requires at least the following to be set:
  - `FULLENV`: Name of the SquashFS file and script directories related to an environment version, e.g. `payu-1.1.6`.
  - `MODULE_VERSION`: By default the module name is `conda`, so if `MODULE_VERSION` is `payu-1.1.6`, the module can be loaded by `module load conda/payu-1.1.6`. Note that for payu, pre-existing modules were loaded using `module load payu/1.1.6`. To maintain this naming scheme, `MODULE_VERSION` is set to `1.1.6`, and `CONDA_MODULE_PATH` overrides the default variable in `/scripts/install_config.sh`.
- `build_inner.sh` (optional): This script can modify environment files during the build scripts before the environment is compressed into a SquashFS file.
- `deploy.sh` (optional): This script runs at the end of the general deploy script after the environment components have been deployed. It is useful for updating the stable or default alias for a modulefile.
- `testconfig.yml`: This is a configuration file used in the test scripts. This file specifies Python modules to skip loading, exceptions to ignore, and modules to preload before testing.
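As an illustration of how these variables fit together, here is a minimal sketch of what an environment's `config.sh` might contain for a released payu version. The variable names are the ones described in this README. The specific values, the way `FULLENV` is derived from `VERSION_TO_MODIFY`, and the `CONDA_MODULE_PATH` layout are assumptions for illustration, not the repository's actual file.

```shell
#!/usr/bin/env bash
# Hypothetical config.sh sketch; all values below are illustrative assumptions.

# Version of payu this environment is built for (assumed value).
VERSION_TO_MODIFY="1.1.6"
# Version used for the default module alias at deployment (assumed value).
STABLE_VERSION="1.1.6"

# Name of the SquashFS file and script directories for this environment version.
FULLENV="payu-${VERSION_TO_MODIFY}"

# Keep the pre-existing "module load payu/1.1.6" naming scheme.
MODULE_VERSION="${VERSION_TO_MODIFY}"

# Override the default module location.
# Assumed layout: <CONDA_BASE>/modules/payu, falling back to /g/data/vk83.
CONDA_MODULE_PATH="${CONDA_BASE:-/g/data/vk83}/modules/payu"

export FULLENV MODULE_VERSION CONDA_MODULE_PATH STABLE_VERSION
```

With these values, `module load payu/1.1.6` would resolve to the environment stored in `payu-1.1.6.sqsh`.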
- Clone this repository, then create and check out a new branch.
- Modify the payu version in the conda environment file `environments/payu/environment.yml`. If the payu version is not set, the latest payu version in the `access-nri` conda channel is used.
- Modify `VERSION_TO_MODIFY` and `STABLE_VERSION` in `environments/payu/config.sh`. `VERSION_TO_MODIFY` is the version of the payu environment to build, and `STABLE_VERSION` is used in deployment to set the default version alias for the payu module. In most cases these should be set to the same value.
- Add any new packages or versions required for the next release to `environment.yml`.
- Push the changes and open a Pull Request to the `main` branch of this repository, `ACCESS-NRI/model-release-condaenv` (not the upstream `coecms/cms-conda-singularity` repository).
- Once a Pull Request is open, it will request a sign-off to run the Build and Test job on the `Gadi` Github Environment. These jobs run in temporary locations on Gadi, so they will not affect the production environments.
- Once the CI jobs have completed successfully and the Pull Request has been approved, the branch can be merged. The Deploy job will then request a sign-off to run on the `Gadi` Github Environment.
- Once deployed, check loading the new module with:
  module use /g/data/vk83/modules
  module load payu/<VERSION>
  where `<VERSION>` is the version installed.
The deploy `payu-dev` Github workflow runs every day to check for new commits to the payu repository's `master` branch. It can also be triggered manually to add updates to packages, and it runs once a Pull Request is merged that modifies the `payu-dev` environment files.
To modify the environment via a Pull Request:
- Clone this repository, then create and check out a new branch.
- Add packages or version constraints in the conda environment file `environments/payu-dev/environment.yml`.
- Unlike when updating the `payu` environment above, leave `VERSION_TO_MODIFY=dev` in `environments/payu-dev/config.sh` unchanged, as a version is generated at the time of deployment to include the date-time and the latest payu commit (e.g. `dev-20250116T002805Z-20a8e76`).
- Push the changes and open a Pull Request to the `main` branch of this repository, `ACCESS-NRI/model-release-condaenv` (not the upstream `coecms/cms-conda-singularity` repository).
- Once a Pull Request is open, it will request a sign-off for a Build and Test job that runs on the `Gadi Prerelease` environment.
- When the Pull Request is merged, a custom `payu-dev` deploy workflow will run, which re-runs the build and test scripts with the generated environment version before running the deploy scripts. This ensures the deployed environment contains the latest changes from the `payu` repository.
- Once deployed, check loading the new module with:
  module use /g/data/vk83/prerelease/modules
  module load payu/dev
  Check that the `payu/dev` environment contains the updates specified in the merged Pull Request.
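The generated development version combines a UTC date-time with a short commit hash. The following is a minimal sketch of how such a string could be assembled. The helper name `make_dev_version` and the choice of a 7-character hash prefix are assumptions based on the example version in this README, not the workflow's actual code.

```shell
#!/usr/bin/env bash
# make_dev_version is a hypothetical helper, not taken from the repository.
# It prints a version of the form dev-<UTC datetime>-<short commit hash>.
make_dev_version() {
    datetime="$1"      # e.g. 20250116T002805Z
    commit_hash="$2"   # full or abbreviated commit hash
    # Keep only the first 7 characters of the hash (assumed length).
    short_hash="$(printf '%s' "$commit_hash" | cut -c1-7)"
    printf 'dev-%s-%s\n' "$datetime" "$short_hash"
}

# Example: current UTC time combined with the hash from the README's example.
# In practice the hash would come from the payu repository, for instance
# via `git rev-parse HEAD` in a local clone.
now="$(date -u +%Y%m%dT%H%M%SZ)"
make_dev_version "$now" "20a8e76"
```

Using a timestamp in this sortable format means that listing the deployed versions alphabetically also lists them chronologically.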
Given a base install directory defined by `CONDA_BASE` (see Github Environment settings), the structure of apps and modules deployed would look something like the following for `payu/1.1.6`:
├── apps
│   ├── base_conda
│   │   ├── bin
│   │   │   └── micromamba # Micromamba used to install environments in build scripts
│   │   ├── envs
│   │   │   ├── payu-1.1.6 -> /opt/conda/payu-1.1.6 # Symlink to conda environment path inside the container
│   │   │   └── payu-1.1.6.sqsh # SquashFS file which contains the conda environment
│   │   └── etc
│   │       └── base.sif # Base singularity image
│   └── conda_scripts
│       ├── launcher_conf.sh
│       ├── launcher.sh
│       ├── overrides
│       └── payu-1.1.6.d # Launcher scripts for the environment
│           ├── bin
│           │   ├── payu -> launcher.sh
│           │   ├── python3 -> launcher.sh
│           │   ├── # Launcher script symlinks for every entry point in the conda environment
│           │   ├── launcher_conf.sh
│           │   └── launcher.sh # Runs commands inside the containerised environment
│           └── overrides
│
└── modules
    └── payu
        ├── .common_v3 # The common Environment Modulefile
        ├── .1.1.6 # Custom module insert
        └── 1.1.6 -> .common_v3
Running `module load payu/1.1.6` adds the launcher scripts in `apps/conda_scripts/payu-1.1.6.d/bin` to the `$PATH`. It also sets up environment variables as if `micromamba activate` was run on the micromamba environment. `SINGULARITYENV_PREPEND_PATH` is set to `/opt/conda/payu-1.1.6/bin`, which modifies `$PATH` only inside a singularity container.
Outside the container, running the `payu setup` command calls the launcher script (e.g. `apps/conda_scripts/payu-1.1.6.d/bin/payu`), which runs a `singularity exec` command that uses the base container in `apps/base_conda/etc/base.sif` and adds the SquashFS file `apps/base_conda/envs/payu-1.1.6.sqsh` as an overlay. The SquashFS file is a read-only filesystem that is mounted onto the base container at runtime, so all the files in `/opt/conda/payu-1.1.6` are accessible inside the container. The launcher script then runs the `payu setup` command inside the container (e.g. `/opt/conda/payu-1.1.6/bin/payu`).
If the `payu setup` launcher script is invoked inside the container, it instead runs the `payu setup` command directly.
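The launcher behaviour described above can be sketched as follows. This is a simplified illustration, not the repository's `launcher.sh`: the helper name `launcher_command` is hypothetical, the paths are relative to the deployed base directory, and it assumes that `SINGULARITY_CONTAINER` is set when running inside a singularity container. To keep the example safe to run anywhere, it prints the command it would execute instead of executing it.

```shell
#!/usr/bin/env bash
# launcher_command is a hypothetical helper for illustration only.
# It prints the command a launcher would run for a given entry point.
launcher_command() {
    cmd="$1"
    shift
    env_name="payu-1.1.6"
    if [ -n "${SINGULARITY_CONTAINER:-}" ]; then
        # Assumption: this variable is set inside the container, so the
        # real executable in the conda environment can be run directly.
        echo "/opt/conda/${env_name}/bin/${cmd} $*"
    else
        # Outside the container: start the base image with the SquashFS
        # file added as an overlay, then run the real executable.
        echo "singularity exec --overlay apps/base_conda/envs/${env_name}.sqsh apps/base_conda/etc/base.sif /opt/conda/${env_name}/bin/${cmd} $*"
    fi
}

launcher_command payu setup
```

Because every entry point in `bin/` is a symlink to the same launcher, a real launcher can derive the command name from the name it was invoked with rather than taking it as an argument.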
General overview of the bash scripts used in the Github workflows:
- Build script (`scripts/build.sh`)
  - Launches the base singularity container and mounts the `/g` directory to a temporary directory on `/jobfs`, so the build does not affect the production environments, while inside the container it is as if building directly to the deployed locations. Then:
    - installs a base micromamba, if it doesn't already exist
    - builds the micromamba environment
    - creates all the launcher scripts
    - sets up the modulefiles
    - runs any custom environment `build_inner.sh` scripts
  - Creates a SquashFS file of the environment and stages this file (in `$ADMIN_DIR/staging`).
  - Once the build is done, archives the environment components into a tar file and stages this file.
- Test script (`scripts/test.sh`)
  - Unpacks the archived tar file of the built environment components to a temporary location.
  - Launches the container with the built SquashFS overlay.
  - Inside the container, runs pytest tests that attempt to import all accessible Python modules. This uses settings defined in an environment's `testconfig.yml` file.
- Deploy script (`scripts/deploy.sh`)
  - Unpacks the archived tar file of the built environment components to a temporary location.
  - Rsyncs the apps and modules directories, and moves the SquashFS file to the deployed locations.
  - Stores the archived tar file in the admin directory.
  - Runs any custom environment `deploy.sh` scripts.
On a Pull Request to the `main` branch of this repository, the workflow checks for changes to any conda environment configuration files (under `environments/`). If changes are detected, it builds a base singularity container on a Github runner and requests a sign-off to deploy to a Github Environment. Once approved, it rsyncs the repository and the built base container to Gadi, then runs the build and test scripts, which are submitted as PBS jobs.
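The change-detection step can be pictured as mapping a list of changed file paths to the environments they belong to. The sketch below is an assumption about one way to do this, not the workflow's actual implementation. The helper name `changed_environments` is hypothetical, and in practice the list of paths might come from a command such as `git diff --name-only`.

```shell
#!/usr/bin/env bash
# changed_environments is a hypothetical helper for illustration only.
# It reads changed file paths on stdin and prints the unique names of the
# environments (subdirectories of environments/) that those paths touch.
changed_environments() {
    grep '^environments/[^/]*/' | cut -d/ -f2 | sort -u
}

# Example with a fixed list of changed paths.
printf '%s\n' \
    "environments/payu/environment.yml" \
    "environments/payu/config.sh" \
    "environments/payu-dev/testconfig.yml" \
    "README.md" | changed_environments
```

Paths outside `environments/`, such as `README.md` in the example, produce no output, which corresponds to the case where no build is needed.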
Once a Pull Request is merged, the workflow requests a sign-off to deploy to a Github Environment. Once approved, it runs the deploy script on a login node. This workflow does not run for the `payu-dev` environment, which is instead deployed by the workflow below.
This workflow runs:
- when there's a merged Pull Request that modifies any `payu-dev` environment files
- if there have been recent commits to the payu-org/payu repository's `master` branch (this check runs on a cron schedule every morning)
- if the workflow has been triggered manually
It will build a base singularity container on a Github runner. Then it will generate the version for the environment with the format `dev-$DATETIME-$GIT_COMMIT_HASH`, and modify the versions in the `payu-dev` environment files. It will then request a sign-off for the `Gadi Prerelease` environment. Once approved, it will rsync the repository with the modified environment files to Gadi. It will re-run the build and test scripts so the built files use the generated dev version, then deploy all required files to the Prerelease location on Gadi and set the `payu/dev` module alias to point to the deployed module version. Finally, it will delete all old versions of `payu/dev-*` environment components, except the recently deployed version and the previous `payu/dev` version, which could be in active use at the time of the deployment.
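The final clean-up rule (delete every old `dev-*` version except the newly deployed one and the previous `payu/dev` target) can be sketched as a filter over version names. The helper name `versions_to_delete` is hypothetical, the version names in the example are made up for illustration, and the sketch only selects names; it does not remove any files.

```shell
#!/usr/bin/env bash
# versions_to_delete is a hypothetical helper for illustration only.
# It reads deployed version names on stdin and prints the dev-* versions
# that are neither the newly deployed version ($1) nor the previous
# payu/dev version ($2).
versions_to_delete() {
    new_version="$1"
    previous_version="$2"
    grep '^dev-' | grep -v -x -F -e "$new_version" -e "$previous_version"
}

# Example with made-up version names; released versions such as 1.1.6
# are ignored because they do not start with "dev-".
printf '%s\n' \
    "dev-20250114T000000Z-aaaaaaa" \
    "dev-20250115T000000Z-bbbbbbb" \
    "dev-20250116T002805Z-20a8e76" \
    "1.1.6" |
    versions_to_delete "dev-20250116T002805Z-20a8e76" "dev-20250115T000000Z-bbbbbbb"
```

Keeping the previous version protects jobs that loaded `payu/dev` shortly before the alias was moved.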
There are currently two Github Environments for deployment: `Gadi` and `Gadi Prerelease`. Any build, test, and/or deploy workflows that need to run on Gadi currently require a sign-off.
Repository settings:
- `vars.PRERELEASE_ENVIRONMENTS`: Space-separated list of quoted environments to deploy to `Gadi Prerelease` (e.g. `"payu-dev"`)
Secrets required for SSH:
- `secrets.HOST`: Hostname for the environment
- `secrets.HOST_DATA`: Hostname for the data mover
- `secrets.SSH_USER`: Username for the account used for deployment
- `secrets.SSH_KEY`: Private SSH key for the user
Variables required for Build/Test/Deploy workflows:
- `secrets.REPO_PATH`: Directory path to sync the repository to. This directory is where all build and deploy scripts are sourced from. Note: The directory can't be in the `/g/data` directory, as `/g` is mounted to a `/jobfs` directory during the build scripts, so the repository would not be accessible.
- `vars.CONDA_BASE`: Base deployment directory that contains the `apps/` and `modules/` subdirectories (e.g. for `Gadi`: `/g/data/vk83` and for `Gadi Prerelease`: `/g/data/vk83/prerelease/`). Note: Currently, the build scripts expect the `CONDA_BASE` directory to be in `/g/data`.
- `vars.ADMIN_DIR`: Directory to store staging and log files, and tar files of deployed conda environments (e.g. `/g/data/vk83/admin/conda_containers/prerelease`)
- `vars.APPS_USERS_GROUP`: Group given read and execute permissions for files installed to apps and modules (e.g. `vk83`)
- `vars.APPS_OWNER`: User with read, write, and execute permissions for installed files
- `vars.PROJECT`: Project code used for build and test PBS jobs (e.g. `tm70`)
- `vars.STORAGE`: Storage directives for build and test PBS jobs (e.g. `gdata/vk83`)
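Two of the notes above are path constraints that are easy to get wrong when configuring a new Github Environment: `REPO_PATH` must not be in `/g/data`, and `CONDA_BASE` is expected to be in `/g/data`. The following sketch shows one way these could be checked before running a workflow. The helper name `check_paths` and the example paths are hypothetical, and this check is not part of the repository's scripts.

```shell
#!/usr/bin/env bash
# check_paths is a hypothetical helper for illustration only.
# Usage: check_paths <REPO_PATH> <CONDA_BASE>
# Returns 0 if both path constraints hold, 1 otherwise.
check_paths() {
    repo_path="$1"
    conda_base="$2"
    case "$repo_path" in
        /g/data|/g/data/*)
            echo "REPO_PATH must not be in /g/data: $repo_path" >&2
            return 1
            ;;
    esac
    case "$conda_base" in
        /g/data/*) ;;
        *)
            echo "CONDA_BASE is expected to be in /g/data: $conda_base" >&2
            return 1
            ;;
    esac
    return 0
}

# Example with an assumed repository location outside /g/data.
check_paths "/scratch/tm70/model-release-condaenv" "/g/data/vk83" && echo "paths look valid"
```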