Flaky Test: JobSet Depends on #786
Comments
Just confirmation for prioritizing.
Yeah, let's call it that for now. Just want to make sure there isn't anything wrong with the functionality.
I was just running the same test locally and it was working fine for me:
@kannon92 Did you see the same errors in other PRs?
You can see the test grid. It failed on the helm chart PR twice this week.
When I ran these test cases locally, I hit a different failure:
JobSet when DependsOn is enabled on JobSet trainer-node Job depends on launcher Job ready status
/Users/s14554/go/src/sigs.k8s.io/jobset/test/e2e/e2e_test.go:345
STEP: Create a JobSet with DependsOn @ 02/23/25 04:49:20.757
STEP: Verify that only Launcher is created @ 02/23/25 04:49:20.766
STEP: Wait for Launcher to be in Ready status @ 02/23/25 04:49:20.769
STEP: Verify that Launcher and Trainer Job is created @ 02/23/25 04:49:32.636
STEP: Wait for JobSet to be Completed @ 02/23/25 04:49:32.64
STEP: checking jobset status is: Completed @ 02/23/25 04:49:32.641
[TIMEDOUT] in [It] - /Users/s14554/go/src/sigs.k8s.io/jobset/test/e2e/e2e_test.go:345 @ 02/23/25 04:49:39.128
• [TIMEDOUT] [18.387 seconds]
JobSet when DependsOn is enabled on JobSet [It] trainer-node Job depends on launcher Job ready status
/Users/s14554/go/src/sigs.k8s.io/jobset/test/e2e/e2e_test.go:345
[TIMEDOUT] A suite timeout occurred
In [It] at: /Users/s14554/go/src/sigs.k8s.io/jobset/test/e2e/e2e_test.go:345 @ 02/23/25 04:49:39.128
This is the Progress Report generated when the suite timeout occurred:
JobSet when DependsOn is enabled on JobSet trainer-node Job depends on launcher Job ready status (Spec Runtime: 18.375s)
/Users/s14554/go/src/sigs.k8s.io/jobset/test/e2e/e2e_test.go:345
In [It] (Node Runtime: 18.371s)
/Users/s14554/go/src/sigs.k8s.io/jobset/test/e2e/e2e_test.go:345
At [By Step] checking jobset status is: Completed (Step Runtime: 6.487s)
/Users/s14554/go/src/sigs.k8s.io/jobset/test/util/util.go:78
Spec Goroutine
goroutine 83 [select]
github.com/onsi/gomega/internal.(*AsyncAssertion).match(0x140002651f0, {0x103923ac8, 0x14000807820}, 0x1, {0x0, 0x0, 0x0})
/Users/s14554/go/pkg/mod/github.com/onsi/[email protected]/internal/async_assertion.go:546
github.com/onsi/gomega/internal.(*AsyncAssertion).Should(0x140002651f0, {0x103923ac8, 0x14000807820}, {0x0, 0x0, 0x0})
/Users/s14554/go/pkg/mod/github.com/onsi/[email protected]/internal/async_assertion.go:145
> sigs.k8s.io/jobset/test/util.JobSetCompleted({0x103932590, 0x1045f6de0}, {0x10393adc0, 0x1400003b0e0}, 0x14000502540, 0x8bb2c97000)
/Users/s14554/go/src/sigs.k8s.io/jobset/test/util/util.go:86
| }
| terminalState := string(jobset.JobSetCompleted)
> gomega.Eventually(checkJobSetStatus, timeout, interval).WithArguments(ctx, k8sClient, js, conditions).Should(gomega.Equal(true))
| gomega.Eventually(checkJobSetTerminalState, timeout, interval).WithArguments(ctx, k8sClient, js, terminalState).Should(gomega.Equal(true))
| }
> sigs.k8s.io/jobset/test/e2e.init.func1.8.2.5()
/Users/s14554/go/src/sigs.k8s.io/jobset/test/e2e/e2e_test.go:407
|
| ginkgo.By("Wait for JobSet to be Completed", func() {
> util.JobSetCompleted(ctx, k8sClient, jobSet, timeout)
| })
| })
github.com/onsi/ginkgo/v2/internal.(*Suite).By(0x140001b6a88, {0x10338a6a3, 0x1f}, {0x1400051df48, 0x1, 0x10336279c?})
/Users/s14554/go/pkg/mod/github.com/onsi/ginkgo/[email protected]/internal/suite.go:323
github.com/onsi/ginkgo/v2.By({0x10338a6a3?, 0xc?}, {0x1400051df48?, 0x3?, 0x0?})
/Users/s14554/go/pkg/mod/github.com/onsi/ginkgo/[email protected]/core_dsl.go:600
> sigs.k8s.io/jobset/test/e2e.init.func1.8.2()
/Users/s14554/go/src/sigs.k8s.io/jobset/test/e2e/e2e_test.go:406
| })
|
> ginkgo.By("Wait for JobSet to be Completed", func() {
| util.JobSetCompleted(ctx, k8sClient, jobSet, timeout)
| })
github.com/onsi/ginkgo/v2/internal.extractBodyFunction.func3({0x14000622d80?, 0x0?})
/Users/s14554/go/pkg/mod/github.com/onsi/ginkgo/[email protected]/internal/node.go:475
github.com/onsi/ginkgo/v2/internal.(*Suite).runNode.func3()
/Users/s14554/go/pkg/mod/github.com/onsi/ginkgo/[email protected]/internal/suite.go:894
github.com/onsi/ginkgo/v2/internal.(*Suite).runNode in goroutine 9
/Users/s14554/go/pkg/mod/github.com/onsi/ginkgo/[email protected]/internal/suite.go:881
------------------------------
[ReportAfterSuite] Autogenerated ReportAfterSuite for --junit-report
autogenerated by Ginkgo
[ReportAfterSuite] PASSED [0.002 seconds]
------------------------------
Summarizing 1 Failure:
[TIMEDOUT] JobSet when DependsOn is enabled on JobSet [It] trainer-node Job depends on launcher Job ready status
/Users/s14554/go/src/sigs.k8s.io/jobset/test/e2e/e2e_test.go:345
Ran 1 of 7 Specs in 18.411 seconds
FAIL! - Suite Timeout Elapsed -- 0 Passed | 1 Failed | 0 Pending | 6 Skipped
--- FAIL: TestAPIs (18.42s)
FAIL
You're using deprecated Ginkgo functionality:
=============================================
--ginkgo.slow-spec-threshold is deprecated --slow-spec-threshold has been deprecated and will be removed in a future version of Ginkgo. This feature has proved to be more noisy than useful. You can use --poll-progress-after, instead, to get more actionable feedback about potentially slow specs and understand where they might be getting stuck.
To silence deprecations that can be silenced set the following environment variable:
ACK_GINKGO_DEPRECATIONS=2.22.2
Tests failed on attempt #115
Could it be related to @ahg-g's comment here: #740 (comment)?
https://testgrid.k8s.io/sig-apps#pull-jobset-test-e2e-main-1-30
I've seen this failure twice on the helm chart PR.
cc @andreyvelich @tenzen-y