Unexpected nodeclaim consolidation #1895
Comments
Can you provide the broader context around how you are able to produce this kind of behavior? The code link alone that you shared doesn't indicate to me that there is a bug. What exact inputs and outputs trigger this?
/triage needs-information
If all instance types in all existing nodepools are sold out, an existing nodeclaim is deleted abnormally by consolidation even if the pods on its node cannot be rescheduled to other nodes. In this case, the referenced code returns nil (errs is nil) because s.nodeClaimTemplates is empty. Returning nil means a pod on the existing node is misjudged as reschedulable to other nodes, existing nodeclaims, or a new nodeclaim. As a result, all pods on that node are misjudged as reschedulable, and the node is finally deleted.
I think the fix could be something like returning an error when s.nodeClaimTemplates is empty.
Observed Behavior:
If all instance types of the nodepools are unavailable (s.nodeClaimTemplates is empty), an existing nodeclaim is misjudged as consolidatable, because all pods on its node are incorrectly determined to be reschedulable to other nodes.
karpenter/pkg/controllers/provisioning/scheduling/scheduler.go
Line 288 in 1db5097
If s.nodeClaimTemplates is empty, a new nodeClaim cannot be created, so an error should be returned instead of nil.