Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI configuration file scheduled branch regex #118

Open
jo-basevi opened this issue Feb 26, 2025 · 10 comments
Open

CI configuration file scheduled branch regex #118

jo-basevi opened this issue Feb 26, 2025 · 10 comments
Assignees

Comments

@jo-basevi
Copy link
Collaborator

Currently, scheduled tests are defined in the config/ci.json file on the main branch of model configuration repositories.

When a scheduled test workflow is triggered, it runs a test for every key in the "scheduled" section.

- name: Get all released configs
id: get-released-config
run: echo "tags=$(jq --compact-output --raw-output '.scheduled | del(.default) | keys' config/ci.json)" >> $GITHUB_OUTPUT

Initially I think the idea of explicitly listing what branches/tags to run scheduled checks on was to limit the amount of tests and resources used.

Regex is used in "qa" and "reproducibility" checks for branch names to define common test configuration for dev/release branches. Does it make sense to use regex in "scheduled" checks? If "release-.*" was set, should it just store common test configuration, and be ignored when creating a matrix for scheduled tests to run?

@anton-seaice
Copy link
Contributor

Unrelated but related.

I find the behavior of running tests for all branches in the reproducability and qa sections, and only the named branches in the scheduled section confusing

Do I have a fix for that? - No :)

@jo-basevi
Copy link
Collaborator Author

I find the behavior of running tests for all branches in the reproducability and qa sections, and only the named branches in the scheduled section confusing

Good point, it isn't consistent. Maybe the section should just be the same as the reproducibility and qa sections and only hold testing configuration for if a test is triggered for that branch/tag. That would mean storing/configuring a list of branches to trigger scheduled tests somewhere. Could add it in the same file, under a different section, e.g.

   "scheduled_tests": [
        "dev-1deg_jra55do_ryf",
    ],
    "scheduled": {
        "default": {
            "markers": "checksum or checksum_slow or access_om3 or config"
        },
        "dev-1deg_jra55do_ryf": {
            "markers": "checksum or checksum_slow or access_om3 or dev_config"
        },
        "release-.*": {
            "model-config-tests-version": "0.0.12",
            "payu-version": "1.1.6"
        }
    },

Could have scheduled_tests as a Github variable on the model configs repository? It would be easily changed/configured but it is less visible and it might look unwieldy when theres several branches/tags.

@CodeGat @aidanheerdegen do you have any ideas?

@CodeGat
Copy link
Member

CodeGat commented Feb 27, 2025

I suppose we're going away from scheduled-tests-as-canaries and more towards scheduled-tests-as-default, which is an evolution.

I like the JSON snippet you have there - I think it works well. I'd probably move away from using GitHub vars to store that kind of information - it's not too visible and I think the branches that are used for testing should be a part of the repositories history.

Maybe there can be two files? config/ci.json for how things are tested (the scheduled section in your snippet, as well as qa/reproducibility), and a config/scheduled.json for what is being tested (the scheduled_tests section in your snippet)..?

@jo-basevi
Copy link
Collaborator Author

Yeah, two files would work. I think initially it was in the one file to group all the CI configuration together in one place, but I think having two files would make the distinction quite clear.

@jo-basevi jo-basevi self-assigned this Feb 27, 2025
@aidanheerdegen
Copy link
Member

Yes the point of isolating the scheduled tests is to specify exactly what it is run on a schedule. Those tests are designed to check for changes in underlying infrastructure that might change answers.

Do we have a specific use-case for why we need to change this functionality? It will help to understand what change is required.

@jo-basevi
Copy link
Collaborator Author

There's currently a bug in specifying release-.* in the scheduled section of the ci.json, as the tests will attempt to run a scheduled test using release-.* as the git tag/branch reference.

I think we still want that functionality of determining exactly what is run on a schedule. So the idea was if there's some common test settings for "release" branches, they can be stored in ci.json under a release-.*, and have a defined list of tags/branches to run scheduled checks on "config/scheduled.json". So for most configurations ci.json file would stay the exact same but there will be a new file scheduled.json that lists explicitly what branches/tags to run scheduled checks on.

@aidanheerdegen
Copy link
Member

I think I'd still like some concrete examples to help understand what is required. I think @anton-seaice may have some ideas on this front.

@anton-seaice
Copy link
Contributor

anton-seaice commented Feb 27, 2025

My main comment is that the current behaviour is that all branches trigger repro and qa tests, but only listed branches trigger scheduled tests is confusing.

e.g.

{
    "scheduled": {
        "default": {
            "markers": "checksum or checksum_slow or access_om3 or config"
        },
        "dev-1deg_jra55do_ryf": {
            "markers": "checksum or checksum_slow or access_om3 or dev_config"
        },
    },
    "reproducibility": {
        "default": {
            "markers": "checksum"
        },
        "dev-1deg_jra55do_ryf": {
            "markers": "checksum or checksum_slow"
        },
    },
    "qa": {
        "default": {
            "markers": "access_om3 or dev_config"
        },
    },
}

Runs scheduled tests on dev-1deg_jra55do_ryf branch, and default doesn't trigger scheduled tests.
But for repro and qa default triggers tests on all branches.

I guess a likely set of will be something like:

{
    "$schema": "https://github.com/ACCESS-NRI/schema/tree/main/au.org.access-nri/model/configuration/ci/2-0-0.json",
    "scheduled": {
        "release-.*": {
            "markers": "checksum or checksum_slow or access_om3 or config",
            "model-config-tests-version": "0.0.12",
            "payu-version": "1.1.6"
        },
       "release-1deg_jra55do_ryf": { } ,
       "release-1deg_jra55do_ryf_wombatlite": { } ,
       "release-regional_jra55do_ryf": { } ,
        "dev-.*": {
            "markers": "checksum or checksum_slow or access_om3 or dev_config",
            "model-config-tests-version": "main",
            "payu-version": "dev"
        },
       "dev-1deg_jra55do_ryf": { } ,
       "dev-1deg_jra55do_ryf_wombatlite": { } ,
       "dev-regional_jra55do_ryf": { } ,
    },
...

Which currently doesn't work, due to this issue.

@jo-basevi
Copy link
Collaborator Author

Runs scheduled tests on dev-1deg_jra55do_ryf branch, and default doesn't trigger scheduled tests.
But for repro and qa default triggers tests on all branches

Sorry about the confusion, I just wanted to clarify that default doesn't trigger tests for all "repro" and "qa" branches, it just stores some test-type level defaults for the test configuration - e.g. markers. The logic for triggering a "repro" or "qa" test to run in a PR is embedded in the Github CI workflows, e.g. qa tests run for any branch being merged in "dev-" or "release-". The config/ci.json is just used in those workflows to parse out some test configuration for a given target branch.

The difference with "scheduled" section currently, is the workflows parse the branch/tags to run tests from config/ci.json. So it specifies what scheduled tests to run.

@jo-basevi
Copy link
Collaborator Author

@aidanheerdegen suggested it might be worth re-thinking how the scheduled tests are configured. A potential improvement could be to have scheduled tests file (config/scheduled.json?), that also controlled how often tests are run, e.g

{
    "daily": {
        # List of branches/tags to run daily scheduled tests on
    },
    "weekly": {
        # List of branches/tags for weekly scheduled tests
    },
    "monthly": {
        # List of branches/tags for monthly scheduled tests
    }
}

This would give some flexibility on running scheduled checks, e.g:

  • more expensive tests (e.g. high resolution configs) or tags can be run less frequently,
  • if branches are being actively worked on, they can moved to be tested more frequently.

The scheduled test configurations (e.g. test markers, payu version) could be moved out of config/ci.json and into this file.

The main question is would having more flexibility around scheduled checks be useful? If so, are there other features/configuration settings that would be useful?

Currently to modify the scheduled tests frequency, requires modifying the cron in the workflow file in the model config repository:

https://github.com/ACCESS-NRI/model-configs-template/blob/c467ce026c6db79919a2f6e858b1a6dedb307e7a/.github/workflows/schedule.yml#L5-L7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants