Reduce the impact of flaky tests to be able to make them more actionable #7055

Closed
2 of 3 tasks
joperezr opened this issue Jan 9, 2025 · 6 comments
Labels
area-engineering-systems infrastructure helix infra engineering repo stuff flaky-test

@joperezr
Member

joperezr commented Jan 9, 2025

Objective: Improve the infrastructure to reduce the impact of flaky tests and make them more actionable.

Tasks:

  • Consider isolating flaky tests into individual jobs so that only those jobs need to be re-run when they fail.
  • Implement mechanisms to quickly identify and disable failing tests.
  • Ensure that test failures are actionable, with proper logs and diagnostics (i.e., once a test fails, we have enough information to investigate and fix it).
@joperezr joperezr added this to the 9.1 milestone Jan 9, 2025
@joperezr joperezr added the area-engineering-systems infrastructure helix infra engineering repo stuff label Jan 9, 2025
@sebastienros
Member

Candidate for quarantining: AzureServiceBusExtensionsTests.VerifyWaitForOnServiceBusEmulatorBlocksDependentResources

Examples:

@eerhardt
Member

eerhardt commented Jan 10, 2025

I've disabled the following in Disable Azure ServiceBus emulator functional tests (dotnet/aspire#7067):

  • AzureServiceBusExtensionsTests.VerifyWaitForOnServiceBusEmulatorBlocksDependentResources
  • AzureServiceBusExtensionsTests.VerifyAzureServiceBusEmulatorResource
  • The ServiceBus portion of the Azure Functions Playground test

@davidfowl
Member

Why do we have this #7056?

@danmoseley
Member

danmoseley commented Jan 20, 2025

Why do we have this #7056?

The assumption was that there would be test-specific work to fix flaky tests, and separate infra work to reduce the impact when tests were actually flaky. But I guess everyone's just using this issue.

@JamesNK JamesNK assigned davidfowl and unassigned JamesNK Jan 20, 2025
@davidfowl
Member

I think we're done with this for 9.1.

We don't have a way to quarantine test runs yet. That would be the last thing I think we could do: run all tests marked with ActiveIssue in a separate workflow that could be used to investigate issues. I was also thinking about a manually triggerable workflow to run a specific test project.
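
As a rough illustration of that split (not the repo's actual infrastructure), the sketch below partitions a list of tests into a regular run and a quarantine run, depending on whether a test carries an active-issue marker. It is written in Python for brevity. `CaseInfo`, `partition_tests`, and the issue identifier are hypothetical names; only the test names come from this thread.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class CaseInfo:
    """Hypothetical record for one test; not part of any real test framework."""
    name: str
    # Identifier of the tracking issue if the test is known to be flaky (assumed field).
    active_issue: Optional[str] = None


def partition_tests(tests: Iterable[CaseInfo]) -> Tuple[List[CaseInfo], List[CaseInfo]]:
    """Split tests into (regular, quarantined) based on the active-issue marker."""
    regular: List[CaseInfo] = []
    quarantined: List[CaseInfo] = []
    for test in tests:
        # A test with any tracking issue goes to the separate quarantine run.
        (quarantined if test.active_issue else regular).append(test)
    return regular, quarantined


if __name__ == "__main__":
    tests = [
        CaseInfo(
            "AzureServiceBusExtensionsTests.VerifyAzureServiceBusEmulatorResource",
            active_issue="hypothetical-issue-id",
        ),
        CaseInfo("SomeOtherTests.HypotheticalStableTest"),
    ]
    regular, quarantined = partition_tests(tests)
    print("regular run:", [t.name for t in regular])
    print("quarantine run:", [t.name for t in quarantined])
```

The regular list would feed the normal PR validation, while the quarantined list would feed the separate, manually triggerable workflow, so a flaky failure there does not block unrelated changes.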

@davidfowl
Member

Closing this issue out.


6 participants