Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip pushing the git commit metadata if BUILDKITE_COMMIT_RESOLVED=true #3152

Merged
merged 6 commits into from
Feb 25, 2025

Conversation

CerealBoy
Copy link
Contributor

@CerealBoy CerealBoy commented Jan 8, 2025

Description

During the checkout phase the agent will validate the BUILDKITE_COMMIT environment variable has been set knowingly within the build already, or push the data within this job. The first job in a build is likely to be the only time this needs to actually happen, as once it is and BUILDKITE_COMMIT has been set it will be available to remaining jobs within the build. We use the meta-data API to track this, which means we're making many API calls back to buildkite.com when really, we shouldn't need to do this by checking the value ourselves.

Disclaimer: This is quite naive in how the value of BUILDKITE_COMMIT is trusted, where we assume that if the value is 40 characters long and the characters match the accepted values of a SHA then we use it.

The value for BUILDKITE_COMMIT should only be HEAD or a git SHA, so we shouldn't have to worry too much. If there's a branch that was set as the value instead, it should fail the validation and then be set anyway (as in, the previous flow is preserved). In that event anyway, or in any where the validation fails, subsequent jobs will go via the happy path as the existing flow sets BUILDKITE_COMMIT to the determined value. I don't think there will be a problem by having this change made to shortcut the process.

Screenshots

I did some additional testing to see if I could get some better data to back-up the intended claims from what this PR will accomplish. The idea is that only the first job in a build would need to go down the path of checking and setting the build metadata, if the BUILDKITE_COMMIT value is not acceptable, and in doing so sets the value so all remaining jobs in the build will skip that work. This tangibly means that we should see a performance improvement with job runs, as there are less calls back to Buildkite.

Screenshot 2025-01-09 at 14 40 49

This shows the log output for the first job, where the metadata pathway is getting triggered as the job sees BUILDKITE_COMMIT=head being the first to run.

Screenshot 2025-01-09 at 14 41 02

This shows the log output for the second job, where the BUILDKITE_COMMIT value matches the regex and is assumed to be a valid commit. This is highlighted by the comment in the output.

Screenshot 2025-01-09 at 14 41 10

This is a screenshot of the trace for the first job, highlighting the duration of the checkout phase consuming most of the time (as the command is super basic), and having a length of just under 7 seconds.

Screenshot 2025-01-09 at 14 41 16

This is a screenshot of the trace for the second job, where the spans included are the same as the first except the timing is drastically reduced without the overhead of the metadata API call(s).

These traces show a rough reduction of 2.5x for the checkout alone.

Testing

  • Tests have run locally (with go test ./...). Buildkite employees may check this if the pipeline has run automatically.
  • Code is formatted (with go fmt ./...)

@CerealBoy CerealBoy requested review from wolfeidau and moskyb January 9, 2025 03:49
Copy link
Contributor

@patrobinson patrobinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you test what happens when you trigger a build for a specific SHA?
In that case we would BUILDKITE_COMMIT gets populated automatically, so then the metadata won't get updated... I wonder if that has any flow on effects?

@CerealBoy
Copy link
Contributor Author

Can you test what happens when you trigger a build for a specific SHA?

I really should've validated that when I was in the middle of doing the testing, will do so and report back any findings.

In that case we would BUILDKITE_COMMIT gets populated automatically, so then the metadata won't get updated... I wonder if that has any flow on effects?

This is part of why I opened the PR, aiming to prompt conversation about this (and any other topics that I may simply not know). Will also make note to validate what happens from that perspective, if anything 🤞🏼

@CerealBoy CerealBoy force-pushed the chore/skip-git-meta-data branch 2 times, most recently from 36c0e57 to 6becf2e Compare January 10, 2025 00:13
@CerealBoy CerealBoy force-pushed the chore/skip-git-meta-data branch from 6becf2e to d0a8f1d Compare January 19, 2025 21:27
Copy link
Contributor

@DrJosh9000 DrJosh9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, two optional nits

@DrJosh9000
Copy link
Contributor

Assuming the testing works out.

@CerealBoy
Copy link
Contributor Author

Assuming the testing works out.

Good point @DrJosh9000, I forgot to share my results here.

Short answer is that everything works fine, the meta-data isn't necessary to be set for builds / steps / jobs to succeed. I didn't try anything terribly complex though, so there's perhaps nuances I didn't find. For the most part where it's needed BUIDLKITE_COMMIT is used as the reference.

I did find 1 use internally, though I think with the current logic to set the meta data when BUILDKITE_COMMIT doesn't match will still satisfy this use. I need to raise this with the team to get further feedback on the viability of the change.

@CerealBoy CerealBoy force-pushed the chore/skip-git-meta-data branch from b1a077d to e4d87ec Compare February 12, 2025 21:39
Copy link
Member

@pda pda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than regexp, could we use git rev-parse to see if BUILDKITE_COMMIT is already resolved? 🤔

Here's a PoC in shell:

$ check() { if [[ $(git rev-parse "$1") == $1 ]]; then echo "already resolved: $1"; else echo "$1 -> $(git rev-parse "$1")"; fi }

$ check c1bb97fb56f448b91fb913d9f9e4d304607191b0
already resolved: c1bb97fb56f448b91fb913d9f9e4d304607191b0

$ check HEAD
HEAD -> c1bb97fb56f448b91fb913d9f9e4d304607191b0

@sj26
Copy link
Member

sj26 commented Feb 24, 2025

Can we add a test?

chore/skip-git-meta-data...chore/skip-git-meta-data-test

I notice some of the other tests have been modified too in a way that looks unrelated - what's going on there?

@CerealBoy
Copy link
Contributor Author

Can we add a test?

chore/skip-git-meta-data...chore/skip-git-meta-data-test

I notice some of the other tests have been modified too in a way that looks unrelated - what's going on there?

The existing tests tracked the progress of the various git commands that are executed, and removing the generating of the meta-data call means there's 1 less git call being made. I'll improve the tests though to cater for the differing cases, with and without the ENV being set.

})

agent := tester.MockAgent(t)
agent.Expect("meta-data", "exists", job.CommitMetadataKey).Exactly(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don’t we expect that this won’t be called here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's why there's .Exactly(0) to specifically highlight it. I could remove these 2 lines completely and it would work as we expect, however I wanted to leave it here with the 0 to be pedantic about it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed the end bit! Nice.

Copy link
Member

@sj26 sj26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 🙌🏻

@CerealBoy CerealBoy changed the title Skip pushing the git commit metadata if BUILDKITE_COMMIT is set Skip pushing the git commit metadata if BUILDKITE_COMMIT_RESOLVED=true Feb 25, 2025
@CerealBoy CerealBoy force-pushed the chore/skip-git-meta-data branch from ea3ec9d to 4039cba Compare February 25, 2025 02:19
@CerealBoy CerealBoy merged commit d6317d6 into main Feb 25, 2025
1 check passed
@CerealBoy CerealBoy deleted the chore/skip-git-meta-data branch February 25, 2025 02:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants