Hard-coded service name precludes deployment of multiple operators #27

grzleadams · 2024-04-22T14:28:48Z

The serviceaccount name is not reliant on the release name so there can only be one per cluster. When trying to deploy a second operator instance to another namespace, you'll get an error like this:

Error: Unable to continue with install: ServiceAccount "pulp-operator-controller-manager" in namespace "pulp" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "pulp-test": current value is "pulp"; annotation validation error: key "meta.helm.sh/release-namespace" must equal "pulp-test": current value is "pulp"

The text was updated successfully, but these errors were encountered:

grzleadams · 2024-04-22T14:31:33Z

Obviously deploying the operator multiple times is not a typical workflow for operators since the service account should get cluster roles that allow it to deploy in multiple namespaces... this is just a case where I wanted to test a newer version of the operator without modifying the older one.

grzleadams · 2024-04-23T15:43:47Z

It looks like it's not possible to deploy a cluster-wide operator (the deployment sets WATCH_NAMESPACE) so it should be fairly straightforward to just append the namespace to the service account name. I'll submit a PR to do so.

git-hyagi · 2024-04-23T17:57:14Z

Hi @grzleadams,

Sorry for the delayed response!
To give a little bit of context, in the past, pulp-operator was cluster-scoped but after some discussion the team decided to change it to a "namespace-scoped" operator.
We did some tests and thought about "supporting" multiple installations of pulp (in different namespaces) using a single operator deployment and also thought about documenting the possibility of doing so, but we didn't extensively test it to consider a "supported feature".

So, in a high level thought about this issue, I think that we first need to address the multiple installations of pulp (in pulp-operator repo), then we can update the Chart with the necessary changes.
WDYT?

grzleadams · 2024-04-23T18:20:34Z

I'm ambivalent; I think there's a lot of value in providing the capability to deploy a cluster-scoped operator, but I also have significant concerns with the security and operational implications of such operators. Most operators/controllers that I've seen (GitHub's ARC, for example) default to cluster scope but can be restricted to a single namespace by the user, which is the approach that makes the most sense to me. The main problem with the multiple-controller deployment method is that all of the controllers in a cluster typically need to be upgraded simultaneously if there are CRD changes, which can be painful (and why I normally opt for a single, cluster-wide controller deployment).

git-hyagi · 2024-04-23T19:18:47Z

I think there's a lot of value in providing the capability to deploy a cluster-scoped operator, but I also have significant concerns with the security and operational implications of such operators.

We are on the same page . 👍

Most operators/controllers that I've seen (GitHub's ARC, for example) default to cluster scope but can be restricted to a single namespace by the user, which is the approach that makes the most sense to me.

I agree.

The main problem with the multiple-controller deployment method is that all of the controllers in a cluster typically need to be upgraded simultaneously if there are CRD changes, which can be painful (and why I normally opt for a single, cluster-wide controller deployment).

This is something that we also encountered in some internal deployments (and indeed, it was very painful to do the upgrades).
For a single cluster-wide controller, did you have any issue with performance or resource utilization? I'm asking this because, besides the upgrade issue when CRD changes, some additional concerns that we have (we will need to test) with a single cluster-wide controller:

we will need to verify the HA features of the controller (multiple replicas, leader election, etc) because, since it will be managing multiple Pulp instances, an outage would be more critical
the resource allocation for the controller pods (more instances of CRs will probably impact in the number of reconciliations, which will probably impact in the cpu/memory utilization)
multiple instances of Pulp in the same namespace and name clashes (possibly letting the controller in an infinite loop)

@dkliban, do you think that we should bring back this discussion of multiple Pulp installations in the next deployers meeting? Do you agree with the above statements?

dkliban · 2024-04-23T19:51:19Z

@git-hyagi We should consider supporting multiple Pulp deployments with one controller. I agree with the concerns your outline above. Though I think we could deliver this feature initially without HA support. Let's discuss next week.

grzleadams · 2024-04-24T01:22:24Z

For a single cluster-wide controller, did you have any issue with performance or resource utilization?

For Pulp specifically, or with operators in general? I wasn't using Pulp before that change to namespace-specific operator scope was made, so unfortunately I can't really comment on that performance. But for other operators, I haven't really had a problem with performance when they're watching the whole cluster. I will say the Pulp operator is somewhat chattier than others so it's unclear to me whether the reconciliation is actually using more cycles than others or if it's just louder about it (for example, Pulp operator does reconciliation when any metadata updates occur, and things like Rancher will update otherwise-useless metadata a lot).

grzleadams · 2024-04-24T01:27:06Z

And to your other two points re: cluster-scoped controllers:

we will need to verify the HA features of the controller (multiple replicas, leader election, etc) because, since it will be managing multiple Pulp instances, an outage would be more critical

I'm not convinced this is really all that important for the operator itself. If you're in a situation where your Pulp installation requires reconciliation and the operator is dead, you probably have bigger problems. If the operator goes down by itself, though, the Pulp deployments shouldn't be impacted.

multiple instances of Pulp in the same namespace and name clashes (possibly letting the controller in an infinite loop)

You can use the CR metadata as a key to avoid name collisions. The combination of metadata.name and metadata.namespace will always be unique.

git-hyagi · 2024-08-31T19:26:43Z

#32

romeroalx mentioned this issue Aug 29, 2024

Basic templating for k8s resources names #32

Merged

git-hyagi closed this as completed Aug 31, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hard-coded service name precludes deployment of multiple operators #27

Hard-coded service name precludes deployment of multiple operators #27

grzleadams commented Apr 22, 2024

grzleadams commented Apr 22, 2024 •

edited

Loading

grzleadams commented Apr 23, 2024

git-hyagi commented Apr 23, 2024

grzleadams commented Apr 23, 2024

git-hyagi commented Apr 23, 2024

dkliban commented Apr 23, 2024

grzleadams commented Apr 24, 2024

grzleadams commented Apr 24, 2024 •

edited

Loading

git-hyagi commented Aug 31, 2024

Hard-coded service name precludes deployment of multiple operators #27

Hard-coded service name precludes deployment of multiple operators #27

Comments

grzleadams commented Apr 22, 2024

grzleadams commented Apr 22, 2024 • edited Loading

grzleadams commented Apr 23, 2024

git-hyagi commented Apr 23, 2024

grzleadams commented Apr 23, 2024

git-hyagi commented Apr 23, 2024

dkliban commented Apr 23, 2024

grzleadams commented Apr 24, 2024

grzleadams commented Apr 24, 2024 • edited Loading

git-hyagi commented Aug 31, 2024

grzleadams commented Apr 22, 2024 •

edited

Loading

grzleadams commented Apr 24, 2024 •

edited

Loading