
Cluster Autoscaler scaling down the on-demand node group's node without any reason #7451

Open
AmanPathak-DevOps opened this issue Nov 1, 2024 · 1 comment
Labels: area/cluster-autoscaler, kind/bug

Comments

@AmanPathak-DevOps

Which component are you using?:

Cluster Autoscaler

What version of the component are you using?:

Component version: v1.31.0 (image registry.k8s.io/autoscaling/cluster-autoscaler:v1.31.0, per the Deployment manifest below)
What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: v1.30.1
Kustomize Version: v5.0.4-0.*********
Server Version: v1.30.4-eks-a737599

What environment is this in?:

A Dev environment on AWS (EKS).
What did you expect to happen?:

I am using Cluster Autoscaler to scale nodes across two node groups (an on-demand node group and a Spot node group). I have deployed the AWS Node Termination Handler (NTH) and the priority expander so that Spot instances are preferred; since Spot capacity can be reclaimed, NTH handles those interruptions. What I cannot figure out is why an on-demand node goes down and sits in Unknown status for 6-8 hours. When an on-demand node goes down, CA should create a replacement, but it does not do so for 5-6 hours.
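For reference, the priority expander is driven by the cluster-autoscaler-priority-expander ConfigMap (also referenced in the Role below). A minimal sketch of the kind of configuration I am using follows; the node-group name patterns are illustrative, not my exact ASG names:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  # Higher number = higher priority, so node groups matching .*spot.* are tried first.
  priorities: |-
    10:
      - .*ondemand.*
    50:
      - .*spot.*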
What happened instead?:

I expect CA to scale nodes up and down within a few minutes (e.g. 5-10 minutes). Instead, on-demand instances go down without any apparent reason, and CA takes more than 5-6 hours to bring up replacement on-demand nodes. Because the node is stuck in Unknown status, all pods running on it remain in Terminating for 4-5 hours, which causes downtime because of the RollingUpdate strategy.
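For context on the timing I expect, cluster-autoscaler has flags that govern these intervals. A minimal sketch of how they could be set explicitly in the Deployment args below (the values shown are the upstream defaults as I understand them; I have not overridden them):

          command:
            - ./cluster-autoscaler
            - --scan-interval=10s              # how often the main reconciliation loop runs
            - --max-node-provision-time=15m    # how long CA waits for a requested node to register
            - --scale-down-unneeded-time=10m   # how long a node must be unneeded before scale-down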

How to reproduce it (as minimally and precisely as possible):

Here are the Kubernetes manifests (ServiceAccount, RBAC, and Deployment) that I am using for the CA:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<Role-ARN>
  name: cluster-autoscaler
  namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources:
      - "namespaces"
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      priorityClassName: system-cluster-critical
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        fsGroup: 65534
        seccompProfile:
          type: RuntimeDefault
      serviceAccountName: cluster-autoscaler
      containers:
        - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.31.0
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 600Mi
            requests:
              cpu: 100m
              memory: 600Mi
          command: 
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
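            # Use the priority expander; priorities are read from the cluster-autoscaler-priority-expander ConfigMap in kube-system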
            - --expander=priority
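            # Auto-discover node-group ASGs by these tags (both the on-demand and the Spot ASG carry them)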
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<cluster-name>
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: true
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-bundle.crt"

Anything else we need to know?:

I1101 04:02:57.320439 1 aws_manager.go:188] Found multiple availability zones for ASG "-e2c94f70-6b1c-a9af-47eb-fea9a5915955"; using ap-south-1c for failure-domain.beta.kubernetes.io/zone label
I1101 04:02:57.320612 1 filter_out_schedulable.go:66] Filtering out schedulables
I1101 04:02:57.320714 1 klogx.go:87] failed to find place for logging/fluentd-jswwh: cannot put pod fluentd-jswwh on any node
I1101 04:02:57.320729 1 filter_out_schedulable.go:123] 0 pods marked as unschedulable can be scheduled.
I1101 04:02:57.320738 1 filter_out_schedulable.go:86] No schedulable pods
I1101 04:02:57.320743 1 filter_out_daemon_sets.go:40] Filtering out daemon set pods
I1101 04:02:57.320748 1 filter_out_daemon_sets.go:49] Filtered out 1 daemon set pods, 0 unschedulable pods left
I1101 04:02:57.320766 1 static_autoscaler.go:557] No unschedulable pods
I1101 04:02:57.320797 1 static_autoscaler.go:580] Calculating unneeded nodes
I1101 04:02:57.320812 1 pre_filtering_processor.go:67] Skipping ip-10-1-137-190.ap-south-1.compute.internal - node group min size reached (current: 1, min: 1)
I1101 04:02:57.320898 1 eligibility.go:104] Scale-down calculation: ignoring 5 nodes unremovable in the last 5m0s
I1101 04:02:57.320940 1 static_autoscaler.go:623] Scale down status: lastScaleUpTime=2024-11-01 03:42:50.431875787 +0000 UTC m=+127994.930106250 lastScaleDownDeleteTime=2024-10-31 06:18:29.370821589 +0000 UTC m=+50933.869052042 lastScaleDownFailTime=2024-10-30 15:09:57.022381669 +0000 UTC m=-3578.479387878 scaleDownForbidden=false scaleDownInCooldown=false
I1101 04:02:57.320969 1 static_autoscaler.go:644] Starting scale down

Node Status

ip-10-1-137-190.ap-south-1.compute.internal NotReady 23h v1.30.4-eks-a737599

@adrianmoisey
Member

/area cluster-autoscaler
