Request for Backport of Bug Fix for EBS CSI Driver Taint Issue to Karpenter v0.37.X #1900

Open
prad9192 opened this issue Jan 3, 2025 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@prad9192

prad9192 commented Jan 3, 2025

Description

We have encountered a bug in which Karpenter incorrectly re-adds the EBS CSI Driver taint to a node after the driver has already removed it.

  • Current Behavior: The taint from the EBS CSI Driver is being added back by Karpenter, causing unexpected behavior.
  • Expected Behavior: Karpenter should not add back the taint once the driver removes it.
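
To make the symptom concrete, here is a minimal Go sketch of how the re-added taint could be detected from two snapshots of a node's taints. It is only an illustration under stated assumptions: `Taint` is a simplified local stand-in for the Kubernetes `core/v1` taint type, `hasTaint` and `taintReadded` are hypothetical helpers, and the key `ebs.csi.aws.com/agent-not-ready` is assumed to be the startup taint the driver removes (please verify it against your driver version). It is not how Karpenter or the driver implements this.

```go
package main

import "fmt"

// Taint is a simplified stand-in for the Kubernetes core/v1 Taint type,
// defined locally so this sketch has no external dependencies.
type Taint struct {
	Key    string
	Value  string
	Effect string
}

// ebsStartupTaintKey is an assumption: the startup taint key that the
// EBS CSI Driver is expected to remove once it is ready on the node.
const ebsStartupTaintKey = "ebs.csi.aws.com/agent-not-ready"

// hasTaint reports whether any taint in the list has the given key.
func hasTaint(taints []Taint, key string) bool {
	for _, t := range taints {
		if t.Key == key {
			return true
		}
	}
	return false
}

// taintReadded reports whether a taint that was absent in an earlier
// snapshot of a node's taints is present again in a later snapshot.
func taintReadded(earlier, later []Taint, key string) bool {
	return !hasTaint(earlier, key) && hasTaint(later, key)
}

func main() {
	// Snapshot taken after the driver removed its startup taint.
	afterDriverReady := []Taint{}
	// Later snapshot in which the taint has reappeared on the node.
	afterReconcile := []Taint{{Key: ebsStartupTaintKey, Effect: "NoExecute"}}

	fmt.Println(taintReadded(afterDriverReady, afterReconcile, ebsStartupTaintKey))
}
```

In a real cluster the two snapshots would come from the node's `spec.taints` at different points in time; the expected behavior described above corresponds to `taintReadded` never returning true for this key.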

This bug has already been addressed and fixed in the following PRs:

However, the fix is only included starting from Karpenter v1.0.4.


Impact

We are currently running Karpenter version v0.37.6 and have started encountering this issue in one of our clusters.

We are in the process of upgrading to v1 following the upgrade guide, which recommends first upgrading to the latest patch within the same minor version. However, the issue persists in v0.37.x, so following that path exposes us to this bug.

Issue details - aws/containers-roadmap#2470


Request

We kindly request a backport of the aforementioned patch to the v0.37.x release series.

This would ensure that users following the upgrade process can avoid encountering this issue and experience a smoother transition to version v1.


Additional Information

  • Karpenter Version: v0.37.6

Thank you for considering this request. Let us know if further details or testing assistance is required.

@prad9192 prad9192 added the kind/bug Categorizes issue or PR as related to a bug. label Jan 3, 2025
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jan 3, 2025
@prad9192 prad9192 changed the title Request for Backport of Bug Fix for EBS CSI Driver Taint Issue to Karpenter v0.37.x Request for Backport of Bug Fix for EBS CSI Driver Taint Issue to Karpenter v0.37.6 Jan 3, 2025
@prad9192 prad9192 changed the title Request for Backport of Bug Fix for EBS CSI Driver Taint Issue to Karpenter v0.37.6 Request for Backport of Bug Fix for EBS CSI Driver Taint Issue to Karpenter v0.37.X Jan 3, 2025
@njtran
Contributor

njtran commented Jan 4, 2025

If you're already upgrading to v1 and you see it's been backported to v1.0.x, is there a reason you can't upgrade to that version right now?

@prad9192
Author

prad9192 commented Jan 4, 2025

That's a good point. We have a custom controller that serves as an interface between developers and Karpenter resources.

At the moment, the controller's Go module is built on Karpenter version 0.35.5, so it will be interesting to test its compatibility with v1. Currently, it works with Karpenter version 0.37.6.

It's a bit more complicated than simply upgrading from 0.35.2 to 0.37.6 and then to v1. This is because we're managing Karpenter deployments using Helm templates by leveraging the manifests within cluster resource sets.

We are currently handling the upgrade in three stages:

  1. Upgrade Karpenter to version 0.37.6.
  2. Update the custom controller to reference Karpenter's v1 module.
  3. Once the above changes are rolled out across all tiers, proceed with upgrading Karpenter to v1.

For more details, I have an ongoing discussion here for reference: GitHub Discussion #1870.

Regarding your original query, I'll get back to you after conducting some tests in the lower tiers.

@jonathan-innis
Member

/triage needs-information

@k8s-ci-robot k8s-ci-robot added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 13, 2025