Reduce the flaky tests impact by 90% #7056

Open
5 tasks
joperezr opened this issue Jan 9, 2025 · 2 comments
Labels: area-meta, flaky-test, tracking (Tracking issue for some TODOs)

joperezr (Member) commented Jan 9, 2025

Objective: Systematically address the flaky tests and reduce their impact by 90%.

Tasks:

  • Pull metrics to identify the most problematic flaky tests.
  • Prioritize and fix the tests based on their impact.
  • Provide regular updates on progress and metrics.
  • Collect and report metrics such as time, number of failures, and number of successful jobs, so that progress can be plotted over time.
  • Create a curve to visualize the progress in reducing flaky tests.
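As a rough illustration of the first and last tasks, here is a minimal sketch in Python (chosen only for brevity; it is not the repo's tooling). It assumes test results have already been exported into simple records. The `RunRecord` shape, the "passed and failed on the same commit" definition of flaky, and both function names are assumptions made for this example, not something the issue or the Azure DevOps analytics page prescribes.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RunRecord:
    """One execution of one test (hypothetical record shape)."""
    test: str
    commit: str
    day: date
    passed: bool


def rank_flaky_tests(runs):
    """Return (test, failure_count) pairs for tests that both passed and
    failed on the same commit, ordered by most failures first."""
    outcomes = defaultdict(set)   # (test, commit) -> set of observed outcomes
    failures = defaultdict(int)   # test -> number of failed runs
    for run in runs:
        outcomes[(run.test, run.commit)].add(run.passed)
        if not run.passed:
            failures[run.test] += 1
    # Assumption: "flaky" means mixed outcomes on an identical commit.
    flaky = {test for (test, _), seen in outcomes.items() if len(seen) == 2}
    return sorted(((t, failures[t]) for t in flaky),
                  key=lambda pair: (-pair[1], pair[0]))


def weekly_failure_rate(runs):
    """Return {(iso_year, iso_week): failed_runs / total_runs} in week order.
    These points are what a progress curve would plot."""
    totals = defaultdict(int)
    failed = defaultdict(int)
    for run in runs:
        iso = run.day.isocalendar()
        week = (iso[0], iso[1])
        totals[week] += 1
        failed[week] += 0 if run.passed else 1
    return {week: failed[week] / totals[week] for week in sorted(totals)}
```

The ranking gives a starting order for the "prioritize by impact" task, and the weekly rates can be fed to any plotting tool to draw the curve. A stricter or looser definition of flaky (for example, failures that pass on retry) would change only the set comprehension.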
joperezr added the flaky-test and tracking labels Jan 9, 2025
joperezr modified the milestone: 9.1 Jan 9, 2025
joperezr changed the title from "Walk down through the flaky tests individually and fixing those" to "Reduce the flaky tests impact by 90%" Jan 9, 2025
davidfowl (Member) commented

Looking at the stats there are some things that stand out:

https://dev.azure.com/dnceng-public/public/_test/analytics?definitionId=274&contextType=build

We should delete the WithDataShouldPersistStateBetweenUsages tests in general (not just the Elasticsearch one) or find a way to make them more reliable. These tests are super sketchy and do things like:

DockerUtils.AttemptDeleteDockerVolume(volumeName, throwOnFailure: true);

Looking at all tests sorted by duration:

[Image: tests sorted by duration]

The starter template should be a separate job, and we should decide what to do with the tests that take more than 2 minutes. (PS: notice that the slow ones are mostly the hosting tests.)
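For triaging the slow tests, a small filter like the following could produce the list to discuss. It is only a sketch in Python: the `slow_tests` helper and the idea of feeding it a name-to-duration mapping are assumptions for illustration, and the 2-minute threshold is simply the cutoff mentioned above.

```python
from datetime import timedelta

# The "more than 2 minutes" cutoff mentioned in the comment above.
SLOW_THRESHOLD = timedelta(minutes=2)


def slow_tests(durations, threshold=SLOW_THRESHOLD):
    """Given a {test_name: timedelta} mapping (hypothetical input shape),
    return (test_name, duration) pairs above the threshold, slowest first."""
    over = [(name, d) for name, d in durations.items() if d > threshold]
    return sorted(over, key=lambda pair: pair[1], reverse=True)
```

Tests that take exactly 2 minutes are not flagged, since the comparison is strict.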

I tried messing with a GitHub Action designed to help isolate and investigate these tests (#7073).

Ideally, we can work in a way where we don't need an hour per run to diagnose test failures by staring at lots of console output.

davidfowl assigned davidfowl and unassigned JamesNK Jan 15, 2025
davidfowl (Member) commented

Taking this one
