-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mutate and Validate RayCluster on SecurityContext #574
Mutate and Validate RayCluster on SecurityContext #574
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1c53e5e
to
1ba3eb2
Compare
rayclusterlog.Error(err, "Failed to check namespace resource labels") | ||
return err | ||
} | ||
if hasSecurityLabels { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not adding the security context unconditionally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually wasn't sure if that's what we wanted, hence made it conditional. On a security point a view, you're right, I think we should always have this securityContext enabled. Should I make it unconditional, and also add to the ValidatingWebhook
to ensure these are set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that should be added unconditionally. I'm not sure how it'll behave on non OpenShift clusters. We may make it only for OpenShift if it has to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm interesting point. I'll make the changes and check the e2e tests. This should be sufficient to confirm its behaviour. Thanks Antonin!
c00e58b
to
bd0aff8
Compare
d5ce875
to
20d8abf
Compare
I've run this on my cluster:
'Traceback (most recent call last):\n File "/tmp/ray/session_2024-06-21_04-50-03_677463_8/runtime_resources/working_dir_files/_ray_pkg_262c3c69a1596acc/mnist_fashion.py", line 84, in \n results = trainer.fit()\n File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/train/base_trainer.py", line 633, in fit\n tuner = Tuner(\n File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/tune/tuner.py", line 179, in init\n self._local_tuner = TunerInternal(**kwargs)\n File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/tune/impl/tuner_internal.py", line 144, in init\n ) = self.setup_create_experiment_checkpoint_dir(\n File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/tune/impl/tuner_internal.py", line 522, in setup_create_experiment_checkpoint_dir\n os.makedirs(experiment_path, exist_ok=True)\n File "/home/ray/anaconda3/lib/python3.9/os.py", line 215, in makedirs\n makedirs(head, exist_ok=exist_ok)\n File "/home/ray/anaconda3/lib/python3.9/os.py", line 225, in makedirs\n mkdir(name, mode)\nPermissionError: [Errno 13] Permission denied: '/home/ray/ray_results'\n' |
@ChristianZaccaria Should we close this then in lieu of opendatahub-io/kuberay#19 ? |
Closing in lieu of opendatahub-io/kuberay#19 |
Issue link
Jira: https://issues.redhat.com/browse/RHOAIENG-7638
What changes have been made
SecurityContext
in the RayCluster's head and worker containers.SecurityContext
in the RayCluster's head and worker containers.Verification steps
namespace
resource with value:Namespace
resource.SecurityContext
is added to both head and worker containers when the mentioned labels have been set.SecurityContext
, and attempt to update theSecurityContext
once the RayCluster is created. The ValidatingWebhook should reject these actions.Checks