Support JAX in Kueue using training-operator 1.9 #4073

Open
mimowo opened this issue Jan 28, 2025 · 5 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@mimowo
Contributor

mimowo commented Jan 28, 2025

What would you like to be added:

Support for JAX, which was recently added to the training-operator; see kubeflow/training-operator#1619.

Why is this needed:

For completeness of Kueue's training-operator support. JAX support was also long awaited in the training-operator, and there will be a community of people who want to use it with Kueue.

@mimowo added the kind/feature label Jan 28, 2025
@mimowo
Contributor Author

mimowo commented Jan 28, 2025

cc @mbobrovskyi @mszadkow
Requires #4066

@tenzen-y
Member

I'm fine with supporting JAXJob in Kueue!

Just FYI: we have not confirmed whether JAXJob is compatible with TPU 😞
Are you OK with that level of JAXJob support?

@mimowo
Contributor Author

mimowo commented Jan 28, 2025

IIUC, you haven't tested it, but that doesn't mean it isn't working? I don't expect much of a difference between GPU and TPU, but without testing it remains unknown.

In any case, I would be OK with claiming support for GPUs.

@tenzen-y
Member

tenzen-y commented Jan 28, 2025

> IIUC, you haven't tested it, but that doesn't mean it isn't working? I don't expect much of a difference between GPU and TPU, but without testing it remains unknown.
>
> In any case, I would be OK with claiming support for GPUs.

Actually, we have not verified either GPU or TPU. We verified only CPU.

@mimowo
Contributor Author

mimowo commented Jan 28, 2025

I see. Getting some confirmation that the controller works with GPU or TPU would be great.

2 participants