This repository has been archived by the owner on Sep 19, 2022. It is now read-only.

Support gang-scheduling by kube-batch #129

Closed
k82cn opened this issue Jan 29, 2019 · 4 comments

Comments

k82cn commented Jan 29, 2019

Gang scheduling is a common requirement for training jobs, and kube-batch supports it right now :) Opening this issue to track the discussion.

k82cn commented Jan 29, 2019

/kind feature

k82cn commented Feb 14, 2019

@johnugeorge, what's the plan for 0.5?

johnugeorge (Member) commented

@k82cn
Currently, gang scheduling behavior is consistent across the TF and PyTorch operators.

Are you working on kubeflow/training-operator#916? This is planned for the 0.5 release across operators.

k82cn commented Mar 27, 2019

Regarding kubeflow/training-operator#916, replacing PDB with PodGroup is done. I'm thinking about how to add some other advanced features from kube-batch via PodGroup :). I'll do some investigation after 0.5 :)
