Azure- returning in-memory size incorrect value when spot instance is deleted #7373

magnetic5355 · 2024-10-09T02:31:05Z

Which component are you using?:cluster-autoscaler

What version of the component are you using?: 1.31

Component version: 1.31

What k8s version are you using (kubectl version)?: 1.30.5+k3s1

kubectl version Output

$ kubectl version

What environment is this in?: Azure

What did you expect to happen?: When a VMSS spot instance is deleted and its node is removed from the cluster, I expect the autoscaler to invalidate its cache.

What happened instead?: Schedulable pods are present, but the autoscaler's in-memory size for the VMSS stays at 9 while the actual VMSS has only 7 instances.

1 filter_out_schedulable.go:78] Schedulable pods present
I1009 02:24:15.536067 1 static_autoscaler.go:557] No unschedulable pods
I1009 02:24:15.536082 1 azure_scale_set.go:217] VMSS: k8-agent-2, returning in-memory size: 0
I1009 02:24:15.536093 1 azure_scale_set.go:217] VMSS: k8-agent-d2ds_v5, returning in-memory size: 9

Eventually, when the cluster tries to scale down, the following is logged in a loop:

I1009 02:31:59.254556 1 static_autoscaler.go:756] Decreasing size of k8-agent-d2ds_v5, expected=9 current=7 delta=-2
I1009 02:31:59.254570 1 azure_scale_set_instance_cache.go:77] invalidating instanceCache for k8-agent-d2ds_v5
I1009 02:31:59.254579 1 azure_scale_set.go:217] VMSS: k8-agent-d2ds_v5, returning in-memory size: 9
I1009 02:31:59.254594 1 static_autoscaler.go:469] Some node group target size was fixed, skipping the iteration
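
To make the loop in these log lines easier to follow, here is a minimal Go sketch of one way such a pattern could arise. It is only a model built from the log output, not the autoscaler's actual code: the `scaleSet` type, its fields, and the helper names are hypothetical, and the assumption that the size is cached separately from the instance list is mine.

```go
package main

import "fmt"

// scaleSet is a hypothetical stand-in for a VMSS node group.
// It is NOT the cluster-autoscaler's real type.
type scaleSet struct {
	name          string
	cachedSize    int      // the "in-memory size" seen in the logs
	instanceCache []string // cached instance IDs
}

// invalidateInstanceCache drops only the cached instance list.
// Assumption: the cached size lives in a separate field and is untouched.
func (s *scaleSet) invalidateInstanceCache() {
	s.instanceCache = nil
}

// targetSize returns the cached size without asking the cloud again.
func (s *scaleSet) targetSize() int {
	return s.cachedSize
}

// sizeDelta mirrors the arithmetic in the log line:
// expected=9 current=7 delta=-2, i.e. delta = current - expected.
func sizeDelta(expected, current int) int {
	return current - expected
}

func main() {
	ss := &scaleSet{name: "k8-agent-d2ds_v5", cachedSize: 9}
	registeredNodes := 7 // nodes actually present in the cluster

	// Each iteration sees the same mismatch, because invalidating the
	// instance list does not change the cached size in this model.
	for i := 0; i < 3; i++ {
		delta := sizeDelta(ss.targetSize(), registeredNodes)
		fmt.Printf("iteration %d: expected=%d current=%d delta=%d\n",
			i, ss.targetSize(), registeredNodes, delta)
		ss.invalidateInstanceCache()
	}
}
```

In this model every iteration reports the same expected=9 and delta=-2, which matches the repeated "Some node group target size was fixed, skipping the iteration" message.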

How to reproduce it (as minimally and precisely as possible):

Set up a K3s cluster (not AKS).
Set the provider ID on the nodes to the proper format, i.e. aks:///
Set the kubernetes.azure.com/agentpool node label.
Add the autoscaler tags to the VMSS.
Increase the workload so that the autoscaler creates new nodes.
Delete a VMSS instance from Azure.

The in-memory size never refreshes, and new nodes are never created.

I have to restart the cluster-autoscaler pod to scale the cluster back up.

Anything else we need to know?:


adrianmoisey · 2024-10-09T11:05:18Z

/kind cluster-autoscaler

k8s-ci-robot · 2024-10-09T11:05:20Z

@adrianmoisey: The label(s) kind/cluster-autoscaler cannot be applied, because the repository doesn't have them.

In response to this:

/kind cluster-autoscaler

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

adrianmoisey · 2024-10-09T11:05:42Z

/area cluster-autoscaler

magnetic5355 added the kind/bug Categorizes issue or PR as related to a bug. label Oct 9, 2024

k8s-ci-robot added the area/cluster-autoscaler label Oct 9, 2024
