PERF-5947: Move Query-related tests to replica-* variants instead of standalones and single-replicas #1257

wqian94 · 2024-08-28T19:11:12Z

Jira Ticket: PERF-5947

What's Changed

Replace standalone-classic-query-engine, standalone-sbe, and the corresponding single-replica variants with the latest replica-query-engine-classic and replica-query-engine-sbe variants, respectively. Also clean up other Query-related tests to move away from standalones and single-replicas where possible.
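The replacement described above could be sketched roughly as follows. This is only an illustration, not the tooling used in this PR: `VARIANT_REPLACEMENTS` and `migrate_variants` are hypothetical names, and only the two standalone names spelled out in the description are mapped, since the exact single-replica variant names are not given here.

```python
# Hypothetical mapping, limited to the names stated in the PR description.
# The single-replica variant names are not listed there, so they are omitted.
VARIANT_REPLACEMENTS = {
    "standalone-classic-query-engine": "replica-query-engine-classic",
    "standalone-sbe": "replica-query-engine-sbe",
}


def migrate_variants(variants):
    """Return the variant list with old names replaced.

    Order is preserved, and a replacement that is already present
    in the list is not added a second time.
    """
    migrated = []
    for name in variants:
        new_name = VARIANT_REPLACEMENTS.get(name, name)
        if new_name not in migrated:
            migrated.append(new_name)
    return migrated


if __name__ == "__main__":
    before = ["standalone-classic-query-engine", "standalone-sbe", "replica"]
    print(migrate_variants(before))
```

Any variant that is not in the mapping passes through unchanged, so the helper is safe to run over lists that have already been migrated.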

Patch Testing Results

Patch.

alicedoherty

Thanks @wqian94. This is changing a lot of tasks. From the perf perspective this looks good (and is what we want), but if possible, can we get someone from Query to also review this to make sure you're happy to shift all testing to replica sets?

Also, there is no variant that runs with replica-80-feature-flags, so I think you can remove all references to it.

src/workloads/execution/MultiPlanning.yml

alicedoherty · 2024-08-29T09:07:41Z

src/workloads/query/AggregateExpressions.yml

-          - standalone-all-feature-flags
-          - standalone-classic-query-engine
-          - standalone-sbe
+          - replica-80-feature-flags


I don't think we currently have any variants in sys-perf that run with replica-80-feature-flags.

wqian94 · 2024-08-29

I saw a bunch of tests with replica-80-feature-flags in the repo already. Do you know what's going on with those?

alicedoherty · 2024-08-30

I'm not sure. The only PR I could find related to it was https://github.com/10gen/dsi/pull/1637, but I couldn't find any that added or removed the variant, and it wasn't removed as part of SERVER-89499. I can follow up more on this if you think having a variant with the 8.0 FFs is necessary.

wqian94 · 2024-08-30

I think we do want an 8.0 FFs variant. Lemme ask around Query and see who might know more.

BlakeIsBlake · 2024-09-03

Now that we've committed to releasing 8.0 with the feature flags we ended up choosing, having an 8.0 feature flags variant isn't that useful. However, I'd advocate leaving the specifications intact for now and having the Product Performance team do the work to remove these wholesale and ensure that we have good substitutes in place.

I've created a ticket to track that effort here.

alicedoherty · 2024-09-04

Thanks @BlakeIsBlake for opening that ticket. As Blake said, I'm happy for you to leave tests that have replica-80-feature-flags as is, and when our team gets to PERF-5857 we can easily switch them to a 9.0/all FF variant. Just bear in mind that it may be slightly confusing for the Genny AutoRuns to imply a test is running on an 8.0 FF variant when it actually isn't.

src/workloads/query/ArrayTraversal.yml

src/workloads/query/BooleanSimplifier.yml

src/workloads/transactions/LLTMixed.yml

alicedoherty

Code changes look good, but can we do some perf/noise analysis to validate that moving to replica sets still gives useful signal? Maybe you could use the variant-to-variant feature in the perf analyser to compare these tests x5 running on standalone vs x5 running on replica? (I think configuring and running all the FF, SBE, classic engine, etc. variants will be a pain, so just the plain standalone and replica should be fine.)

Also can we get someone from query on this PR?
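As a rough illustration of the noise comparison requested above, the coefficient of variation (CoV) for a handful of runs per variant could be computed like this. This is a minimal sketch, not the perf analyser's actual method: the function name is hypothetical, the numbers are made up, and it assumes CoV is taken as the sample standard deviation divided by the mean.

```python
import statistics


def coefficient_of_variation(samples):
    """Return the sample standard deviation divided by the mean.

    Needs at least two samples; a zero mean is rejected because
    the ratio is undefined there.
    """
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("CoV is undefined for a zero mean")
    return statistics.stdev(samples) / mean


# Made-up throughput numbers for five runs on each variant (assumption).
standalone_runs = [1000.0, 1010.0, 990.0, 1005.0, 995.0]
replica_runs = [900.0, 940.0, 880.0, 910.0, 870.0]

if __name__ == "__main__":
    for label, runs in (("standalone", standalone_runs), ("replica", replica_runs)):
        print(f"{label}: CoV = {coefficient_of_variation(runs):.2%}")
```

A lower CoV means the runs agree more closely with each other; what counts as acceptable is a judgment call for the reviewers and is not defined in this thread.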

wqian94 · 2024-09-03T15:03:44Z

Looks like SERVER-88741 is the ticket that added the -80-feature-flags variants we're seeing. I've asked Blake Oler, who originally committed these variants, whether we should keep or remove these variants from the variants lists.

wqian94 · 2024-09-03T15:06:22Z

Also, @alicedoherty the variant-to-variant comparisons seemed to spawn an excessive number of tasks. Since many of them failed, the perf analyzer is unable to create a comparison. I'm going to retry this week with fewer tasks and hopefully that's enough to convince us that the CoVs are acceptable.

wqian94 · 2024-09-03T19:09:15Z

@BlakeIsBlake this is the PR in question. Do you think we should leave the *-80-feature-flags in to make it easier to replace come 9.0, or is it better to remove them now and add them in later?

wqian94 · 2024-09-13T16:17:14Z

Just a quick update, since I neglected to do so earlier: I'm rerunning the v2v analysis now that the crash I was hitting in it has been fixed.

wqian94 · 2024-09-13T20:28:09Z

Analysis complete for classic. CoV seems slightly high for some tests, though?

wqian94 · 2024-09-13T20:29:24Z

I don't think the SBE variant is available on master right now, so I can either compare older checkouts or just let it be for now.

alicedoherty · 2024-09-16T10:38:52Z

> Analysis complete for classic. CoV seems slightly high for some tests, though?

Thank you! The tests with very high CV on the 3-node replica sets look to be already very noisy on standalones. Although the percentages seem to have gone up for some, clicking into the actual values makes it look like the values are in the range of noise previously seen on standalones so I'm not too worried about them.
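The check described above (looking at the raw values rather than the percentages) could be approximated like this. It is only a sketch under stated assumptions: `within_observed_range` and its `tolerance` parameter are hypothetical, the data is invented, and the real review was done by eye in the perf tooling.

```python
def within_observed_range(history, new_values, tolerance=0.0):
    """Check whether every new value lies inside the historical min/max band.

    `tolerance` widens the band on each side by that fraction of the
    historical spread (0.0 means the exact min/max).
    """
    low, high = min(history), max(history)
    margin = (high - low) * tolerance
    return all(low - margin <= value <= high + margin for value in new_values)


if __name__ == "__main__":
    # Invented numbers: earlier standalone results vs. new replica results.
    standalone_history = [90.0, 100.0, 110.0]
    replica_results = [95.0, 105.0]
    print(within_observed_range(standalone_history, replica_results))
```

A min/max band is a deliberately crude test; it only answers whether the new values would have looked unremarkable among the old ones.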

> I don't think the SBE variant is available on master right now, so I can either compare older checkouts or just let it be for now.

We have SBE Standalone ARM AWS 2023-11 and Query Engine (SBE) 3-Node ReplSet ARM AWS 2023-11 running on master right now - what issues were you having when scheduling them?

I'd like to see a variant to variant analysis for the SBE 3-Node vs SBE Standalone (just like you did with Classic Engine) if possible. Thanks again @wqian94 for the analysis on this!

wqian94 · 2024-09-16T12:54:14Z

The old SBE variant doesn't seem to be available for v2v comparison last I checked. I'll look again today.

alicedoherty · 2024-09-20T15:36:21Z

I've generated a perf comparison to compare the performance of these tasks on standalone vs 3-node replica sets for SBE.

Moving from standalone to 3-node has a decent perf impact (as expected) on a lot of tests, but I'd like to spend more time going through these to be safe. The same goes for double-checking that the noise is reasonable. I don't think this is something I will be able to do by EOD today.

If this change is not high priority right now I'd propose to assign DEVPROD-9461 to me and I'll take some time to properly sign off on these changes. I'll be OOO next week but can get back to this asap the week after.

dstorch · 2024-10-01T19:58:53Z

I'm removing myself from this PR. I think Alice's approval should be sufficient.

alicedoherty

Thanks @wqian94 for your patience with this. I'm happy for this switch to be made now.

For more context: looking at the results linked above, for most tasks this shouldn't have a big impact on performance (for the metrics that actually matter). Some tasks get noisier and some get less noisy, but none raised alarm bells for me. Also, at the end of the day, the plan is for a prioritised subset of these Genny workloads to get ported to Locust, and Query will be defining their own "gates of performance" (DEVPROD-5362), so in general these workloads are in a state of flux.

Before you merge, can you please double-check that you've followed all the steps in the updated Genny docs here? Since this PR went up, the way AutoRun tasks are generated has changed, so I want to make sure these tasks will still be generated properly.

Additionally, can you please open a ticket/PR to remove the no longer used SBE and Classic engine variants from sys-perf? It could be worth waiting a week or two to remove them, in case there's any weirdness going on once this change goes in.

Thank you!

alicedoherty · 2024-10-15T13:39:53Z

Patch here

thessem

Only reviewing the files I have ownership over

alicedoherty

LGTM - I'm happy for this to be merged now

DEVPROD-9461: Move Query-related tests to replica-* variants instead …

fbae500

…of standalones and single-replicas

wqian94 requested a review from alicedoherty August 28, 2024 19:11

wqian94 requested a review from a team as a code owner August 28, 2024 19:11

alicedoherty requested changes Aug 29, 2024

Clean up some remaining standalone and single-replica SBE and classic…

73b22bb

… variants

wqian94 requested a review from alicedoherty August 29, 2024 18:48

alicedoherty reviewed Aug 30, 2024

wqian94 requested a review from dstorch September 3, 2024 14:53

BlakeIsBlake self-requested a review September 3, 2024 19:10

dstorch removed their request for review October 1, 2024 19:58

alicedoherty approved these changes Oct 2, 2024

wqian94 changed the title ~~DEVPROD-9461: Move Query-related tests to replica-* variants instead of standalones and single-replicas~~ PERF-5947: Move Query-related tests to replica-* variants instead of standalones and single-replicas Oct 14, 2024

wqian94 and others added 4 commits October 14, 2024 16:05

Merge conflict resolution

35b981c

Merge conflict resolution for non-autorun test

2a55730

Merge branch 'master' into william.qian/query-variants

f1bc1b7

auto generate genny tasks

8c0c734

alicedoherty requested a review from a team as a code owner October 15, 2024 13:33

thessem approved these changes Oct 16, 2024

alicedoherty approved these changes Oct 17, 2024

wqian94 added this pull request to the merge queue Oct 24, 2024

Merged via the queue into master with commit 7401c77 Oct 24, 2024
11 checks passed

wqian94 deleted the william.qian/query-variants branch October 24, 2024 17:30
