Rename ClusterRing to ControllerRing (#438)
* Rename `ClusterRing` type, generate CRD

* Simplify shard and drain labels

* Adapt shard library

* Simplify metric labels

* Adapt controllers and tests

* Adapt example shard

* Adapt webhosting-operator

* Adapt sharding-exporter

* Adapt manifests

* Update docs
timebertt authored Jan 24, 2025

1 parent 489a4bd commit 9506c28
Showing 55 changed files with 460 additions and 491 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -141,7 +141,7 @@ SHARD_NAME ?= shard-$(shell tr -dc bcdfghjklmnpqrstvwxz2456789 </dev/urandom | h

.PHONY: run-shard
run-shard: $(KUBECTL) ## Run a shard from your host and deploy prerequisites.
$(KUBECTL) apply --server-side --force-conflicts -k hack/config/shard/clusterring
$(KUBECTL) apply --server-side --force-conflicts -k hack/config/shard/controllerring
go run ./cmd/shard --shard=$(SHARD_NAME) --lease-namespace=default --zap-log-level=debug

PUSH ?= false
6 changes: 3 additions & 3 deletions README.md
@@ -43,9 +43,9 @@ It distributes reconciliation of Kubernetes objects across multiple controller i
For this, the project applies proven sharding mechanisms used in distributed databases to Kubernetes controllers.

The project introduces a `sharder` component that implements sharding in a generic way and can be applied to any Kubernetes controller (independent of the used programming language and controller framework).
The `sharder` component is installed into the cluster along with a `ClusterRing` custom resource.
A `ClusterRing` declares a virtual ring of sharded controller instances and specifies API resources that should be distributed across shards in the ring.
It configures sharding on the cluster-scope level (i.e., objects in all namespaces), hence the `ClusterRing` name.
The `sharder` component is installed into the cluster along with a `ControllerRing` custom resource.
A `ControllerRing` declares a virtual ring of sharded controller instances and specifies API resources that should be distributed across shards in the ring.
It configures sharding on the cluster-scope level (i.e., objects in all namespaces), hence the `ControllerRing` name.

The watch cache is an expensive part of a controller regarding network transfer, CPU (decoding), and memory (local copy of all objects).
When running multiple instances of a controller, the individual instances must thus only watch the subset of objects they are responsible for.
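
To make the rename concrete, a `ControllerRing` object might look like the following. This is an illustrative sketch only: the field names are inferred from the CRD descriptions further down in this commit, and the `example` ring sharding `ConfigMaps` together with their owned `Secrets` mirrors the example shard from the development docs.

```yaml
# Hypothetical example instance of the renamed resource; structure inferred from the
# generated CRD in this commit, not taken verbatim from the repository.
apiVersion: sharding.timebertt.dev/v1alpha1
kind: ControllerRing
metadata:
  name: example
spec:
  resources:
  - group: ""          # core API group
    resource: configmaps
    controlledResources:
    - group: ""
      resource: secrets
```
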
26 changes: 13 additions & 13 deletions cmd/shard/main.go
@@ -79,10 +79,10 @@ running a full controller that complies with the sharding requirements.`,
}

type options struct {
zapOptions *zap.Options
clusterRingName string
leaseNamespace string
shardName string
zapOptions *zap.Options
controllerRingName string
leaseNamespace string
shardName string
}

func newOptions() *options {
@@ -92,12 +92,12 @@ func newOptions() *options {
TimeEncoder: zapcore.ISO8601TimeEncoder,
},

clusterRingName: "example",
controllerRingName: "example",
}
}

func (o *options) AddFlags(fs *pflag.FlagSet) {
fs.StringVar(&o.clusterRingName, "clusterring", o.clusterRingName, "Name of the ClusterRing the shard belongs to.")
fs.StringVar(&o.controllerRingName, "controllerring", o.controllerRingName, "Name of the ControllerRing the shard belongs to.")
fs.StringVar(&o.leaseNamespace, "lease-namespace", o.leaseNamespace, "Namespace to use for the shard lease. Defaults to the pod's namespace if running in-cluster.")
fs.StringVar(&o.shardName, "shard", o.shardName, "Name of the shard. Defaults to the instance's hostname.")

@@ -107,8 +107,8 @@ func (o *options) AddFlags(fs *pflag.FlagSet) {
}

func (o *options) validate() error {
if o.clusterRingName == "" {
return fmt.Errorf("--clusterring must not be empty")
if o.controllerRingName == "" {
return fmt.Errorf("--controllerring must not be empty")
}

return nil
@@ -127,9 +127,9 @@ func (o *options) run(ctx context.Context) error {

log.Info("Setting up shard lease")
shardLease, err := shardlease.NewResourceLock(restConfig, nil, shardlease.Options{
ClusterRingName: o.clusterRingName,
LeaseNamespace: o.leaseNamespace, // optional, can be empty
ShardName: o.shardName, // optional, can be empty
ControllerRingName: o.controllerRingName,
LeaseNamespace: o.leaseNamespace, // optional, can be empty
ShardName: o.shardName, // optional, can be empty
})
if err != nil {
return fmt.Errorf("failed creating shard lease: %w", err)
@@ -161,7 +161,7 @@ func (o *options) run(ctx context.Context) error {
// If your shard watches sharded objects as well as non-sharded objects, use cache.Options.ByObject to configure
// the label selector on object level.
DefaultLabelSelector: labels.SelectorFromSet(labels.Set{
shardingv1alpha1.LabelShard(shardingv1alpha1.KindClusterRing, "", o.clusterRingName): shardLease.Identity(),
shardingv1alpha1.LabelShard(o.controllerRingName): shardLease.Identity(),
}),
},
})
@@ -170,7 +170,7 @@ func (o *options) run(ctx context.Context) error {
}

log.Info("Setting up controller")
if err := (&Reconciler{}).AddToManager(mgr, o.clusterRingName, shardLease.Identity()); err != nil {
if err := (&Reconciler{}).AddToManager(mgr, o.controllerRingName, shardLease.Identity()); err != nil {
return fmt.Errorf("failed adding controller: %w", err)
}
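
With the renamed flag, running a local shard looks roughly like this (a sketch based on the `run-shard` Makefile target above; `--controllerring` defaults to `example`, and the shard name here is just an illustration):

```bash
# Flag names follow the renamed options above; values are illustrative.
go run ./cmd/shard \
  --controllerring=example \
  --shard=shard-local \
  --lease-namespace=default \
  --zap-log-level=debug
```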

6 changes: 3 additions & 3 deletions cmd/shard/reconciler.go
@@ -44,7 +44,7 @@ type Reconciler struct {
}

// AddToManager adds Reconciler to the given manager.
func (r *Reconciler) AddToManager(mgr manager.Manager, clusterRingName, shardName string) error {
func (r *Reconciler) AddToManager(mgr manager.Manager, controllerRingName, shardName string) error {
if r.Client == nil {
r.Client = mgr.GetClient()
}
@@ -55,15 +55,15 @@ func (r *Reconciler) AddToManager(mgr manager.Manager, clusterRingName, shardNam
// - wrapping the actual reconciler in a reconciler that handles the drain operation for us
return builder.ControllerManagedBy(mgr).
Named("configmap").
For(&corev1.ConfigMap{}, builder.WithPredicates(shardcontroller.Predicate(clusterRingName, shardName, ConfigMapDataChanged(), predicate.GenerationChangedPredicate{}))).
For(&corev1.ConfigMap{}, builder.WithPredicates(shardcontroller.Predicate(controllerRingName, shardName, ConfigMapDataChanged(), predicate.GenerationChangedPredicate{}))).
Owns(&corev1.Secret{}, builder.WithPredicates(ObjectDeleted())).
WithOptions(controller.Options{
MaxConcurrentReconciles: 5,
}).
Complete(
shardcontroller.NewShardedReconciler(mgr).
For(&corev1.ConfigMap{}).
InClusterRing(clusterRingName).
InControllerRing(controllerRingName).
WithShardName(shardName).
MustBuild(r),
)
2 changes: 1 addition & 1 deletion cmd/sharder/app/options.go
@@ -183,7 +183,7 @@ func (o *options) applyCacheOptions() {
// filter lease cache for shard leases to avoid watching all leases in cluster
leaseSelector := labels.NewSelector()
{
ringRequirement, err := labels.NewRequirement(shardingv1alpha1.LabelClusterRing, selection.Exists, nil)
ringRequirement, err := labels.NewRequirement(shardingv1alpha1.LabelControllerRing, selection.Exists, nil)
utilruntime.Must(err)
leaseSelector = leaseSelector.Add(*ringRequirement)
}
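
For illustration, the requirement built above could be plugged into a controller-runtime cache restricted to shard leases. The following is a hedged sketch, assuming the lease label key shown in the development docs (`alpha.sharding.timebertt.dev/controllerring`); the actual sharder resolves the key via its constants package and may wire the options differently.

```go
package main

import (
	coordinationv1 "k8s.io/api/coordination/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/selection"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// shardLeaseCacheOptions restricts the Lease cache to leases that carry the ring
// label, so the sharder does not watch every Lease in the cluster.
func shardLeaseCacheOptions() cache.Options {
	ringRequirement, err := labels.NewRequirement(
		"alpha.sharding.timebertt.dev/controllerring", selection.Exists, nil)
	utilruntime.Must(err)

	return cache.Options{
		ByObject: map[client.Object]cache.ByObject{
			&coordinationv1.Lease{}: {Label: labels.NewSelector().Add(*ringRequirement)},
		},
	}
}

func main() { _ = shardLeaseCacheOptions() }
```
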
2 changes: 1 addition & 1 deletion config/crds/kustomization.yaml
@@ -6,4 +6,4 @@ commonLabels:

resources:
- namespace.yaml
- sharding.timebertt.dev_clusterrings.yaml
- sharding.timebertt.dev_controllerrings.yaml
config/crds/sharding.timebertt.dev_controllerrings.yaml (renamed from sharding.timebertt.dev_clusterrings.yaml)
@@ -4,14 +4,14 @@ kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.17.1
name: clusterrings.sharding.timebertt.dev
name: controllerrings.sharding.timebertt.dev
spec:
group: sharding.timebertt.dev
names:
kind: ClusterRing
listKind: ClusterRingList
plural: clusterrings
singular: clusterring
kind: ControllerRing
listKind: ControllerRingList
plural: controllerrings
singular: controllerring
scope: Cluster
versions:
- additionalPrinterColumns:
@@ -31,8 +31,9 @@ spec:
schema:
openAPIV3Schema:
description: |-
ClusterRing declares a virtual ring of sharded controller instances. The specified objects are distributed across
shards of this ring on the cluster-scope (i.e., objects in all namespaces). Hence, the "Cluster" prefix.
ControllerRing declares a virtual ring of sharded controller instances. Objects of the specified resources are
distributed across shards of this ring. Objects in all namespaces are considered unless a namespaceSelector is
specified.
properties:
apiVersion:
description: |-
@@ -53,7 +54,7 @@ spec:
type: object
spec:
description: Spec contains the specification of the desired behavior of
the ClusterRing.
the ControllerRing.
properties:
namespaceSelector:
description: |-
@@ -108,14 +109,14 @@ spec:
x-kubernetes-map-type: atomic
resources:
description: Resources specifies the list of resources that are distributed
across shards in this ClusterRing.
across shards in this ControllerRing.
items:
description: RingResource specifies a resource along with controlled
resources that is distributed across shards in a ring.
properties:
controlledResources:
description: |-
ControlledResources are additional resources that are distributed across shards in the ClusterRing.
ControlledResources are additional resources that are distributed across shards in the ControllerRing.
These resources are controlled by the controller's main resource, i.e., they have an owner reference with
controller=true back to the GroupResource of this RingResource.
Typically, the controller also watches objects of this resource and enqueues the owning object (of the main
@@ -154,7 +155,7 @@ spec:
type: object
status:
description: Status contains the most recently observed status of the
ClusterRing.
ControllerRing.
properties:
availableShards:
description: AvailableShards is the total number of available shards
@@ -224,7 +225,7 @@ spec:
- type
x-kubernetes-list-type: map
observedGeneration:
description: The generation observed by the ClusterRing controller.
description: The generation observed by the ControllerRing controller.
format: int64
type: integer
shards:
2 changes: 1 addition & 1 deletion config/monitoring/sharding-exporter/clusterrole.yaml
@@ -22,7 +22,7 @@ rules:
- apiGroups:
- sharding.timebertt.dev
resources:
- clusterrings
- controllerrings
verbs:
- get
- list
18 changes: 9 additions & 9 deletions config/monitoring/sharding-exporter/config.yaml
@@ -10,7 +10,7 @@ spec:
labelsFromPath:
namespace: [metadata, namespace]
shard: [metadata, name]
clusterring: [metadata, labels, alpha.sharding.timebertt.dev/clusterring]
controllerring: [metadata, labels, alpha.sharding.timebertt.dev/controllerring]
metrics:
- name: info
help: "Information about a Shard"
@@ -30,36 +30,36 @@ spec:
# The usual leader election leases don't have the state label making the generator log errors.
# Hence, decrease verbosity of such errors to reduce distraction.
errorLogV: 4
# clusterring metrics
- metricNamePrefix: kube_clusterring
# controllerring metrics
- metricNamePrefix: kube_controllerring
groupVersionKind:
group: sharding.timebertt.dev
version: v1alpha1
kind: ClusterRing
kind: ControllerRing
labelsFromPath:
clusterring: [metadata, name]
controllerring: [metadata, name]
uid: [metadata, uid]
metrics:
- name: metadata_generation
help: "The generation of a ClusterRing"
help: "The generation of a ControllerRing"
each:
type: Gauge
gauge:
path: [metadata, generation]
- name: observed_generation
help: "The latest generation observed by the ClusterRing controller"
help: "The latest generation observed by the ControllerRing controller"
each:
type: Gauge
gauge:
path: [status, observedGeneration]
- name: status_shards
help: "The ClusterRing's total number of shards observed by the ClusterRing controller"
help: "The ControllerRing's total number of shards observed by the ControllerRing controller"
each:
type: Gauge
gauge:
path: [status, shards]
- name: status_available_shards
help: "The ClusterRing's number of available shards observed by the ClusterRing controller"
help: "The ControllerRing's number of available shards observed by the ControllerRing controller"
each:
type: Gauge
gauge:
4 changes: 2 additions & 2 deletions config/rbac/role.yaml
@@ -40,15 +40,15 @@ rules:
- apiGroups:
- sharding.timebertt.dev
resources:
- clusterrings
- controllerrings
verbs:
- get
- list
- watch
- apiGroups:
- sharding.timebertt.dev
resources:
- clusterrings/status
- controllerrings/status
verbs:
- patch
- update
2 changes: 1 addition & 1 deletion docs/assets/architecture.svg
12 changes: 6 additions & 6 deletions docs/design.md
@@ -23,17 +23,17 @@ Notably, no leader election is performed, and there is no designated single acti
Instead, each controller instance maintains an individual shard `Lease` labeled with the ring's name, allowing them to announce themselves to the sharder for membership and failure detection.
The sharder watches these leases to build a hash ring with the available instances.

### The `ClusterRing` Resource and Sharder Webhook
### The `ControllerRing` Resource and Sharder Webhook

Rings of controllers are configured through the use of the `ClusterRing` custom resource.
The sharder creates a `MutatingWebhookConfiguration` for each `ClusterRing` to perform assignments for objects associated with the ring.
Rings of controllers are configured through the use of the `ControllerRing` custom resource.
The sharder creates a `MutatingWebhookConfiguration` for each `ControllerRing` to perform assignments for objects associated with the ring.
The sharder webhook is called on `CREATE` and `UPDATE` requests for configured resources, but only for objects that don't have the ring-specific shard label, i.e., for unassigned objects.

The sharder uses the consistent hashing ring to determine the desired shard and adds the shard label during admission accordingly.
Shards then use a label selector for the shard label with their own instance name to restrict the cache and controller to the subset of objects assigned to them.
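
As an illustration of "only for objects that don't have the ring-specific shard label", the ring-specific webhook presumably carries an `objectSelector` along these lines. This is a hedged sketch: the label key format (a short hash prefix plus the ring name) is taken from the development docs in this commit, and the surrounding webhook configuration is omitted.

```yaml
# Sketch of a per-ring webhook objectSelector for a ControllerRing named "example";
# only unassigned objects (no shard label yet) are sent to the sharder webhook.
objectSelector:
  matchExpressions:
  - key: shard.alpha.sharding.timebertt.dev/50d858e0-example
    operator: DoesNotExist
```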

For the controller's "main" object (configured in `ClusterRing.spec.resources[]`), the object's `apiVersion`, `kind`, `namespace`, and `name` are concatenated to form its hash key.
For objects controlled by other objects (configured in `ClusterRing.spec.resources[].controlledResources[]`), the sharder utilizes information about the controlling object (`ownerReference` with `controller=true`) to calculate the object's hash key.
For the controller's "main" object (configured in `ControllerRing.spec.resources[]`), the object's `apiVersion`, `kind`, `namespace`, and `name` are concatenated to form its hash key.
For objects controlled by other objects (configured in `ControllerRing.spec.resources[].controlledResources[]`), the sharder utilizes information about the controlling object (`ownerReference` with `controller=true`) to calculate the object's hash key.
This ensures that owned objects are consistently assigned to the same shard as their owner.

### Object Movements and Rebalancing
@@ -88,7 +88,7 @@ The comparisons show that the sharder's resource consumption is almost constant
### Minimize Impact on the Critical Path

While the use of mutating webhooks might allow dropping watches for the sharded objects, they can have a significant impact on API requests, e.g., regarding request latency.
To minimize the impact of the sharder's webhook on the overall request latency, the webhook is configured to only react on precisely the set of objects configured in the `ClusterRing` and only for `CREATE` and `UPDATE` requests of unassigned objects.
To minimize the impact of the sharder's webhook on the overall request latency, the webhook is configured to only react on precisely the set of objects configured in the `ControllerRing` and only for `CREATE` and `UPDATE` requests of unassigned objects.
With this the webhook is only on the critical path during initial object creation and whenever the set of available shards requires reassignments.

Furthermore, webhooks can cause API requests to fail entirely.
25 changes: 13 additions & 12 deletions docs/development.md
@@ -83,7 +83,7 @@ Assuming a fresh kind cluster:
make run
```

Now, create the `example` `ClusterRing` and run a local shard:
Now, create the `example` `ControllerRing` and run a local shard:

```bash
make run-shard
@@ -92,13 +92,13 @@ make run-shard
You should see that the shard successfully announced itself to the sharder:

```bash
$ kubectl get lease -L alpha.sharding.timebertt.dev/clusterring,alpha.sharding.timebertt.dev/state
NAME HOLDER AGE CLUSTERRING STATE
shard-h9np6f8c shard-h9np6f8c 8s example ready
$ kubectl get lease -L alpha.sharding.timebertt.dev/controllerring,alpha.sharding.timebertt.dev/state
NAME HOLDER AGE CONTROLLERRING STATE
shard-fkpxhjk8 shard-fkpxhjk8 18s example ready

$ kubectl get clusterring
$ kubectl get controllerring
NAME READY AVAILABLE SHARDS AGE
example True 1 1 15s
example True 1 1 34s
```

Running the shard locally gives you the option to test non-graceful termination, i.e., a scenario where the shard fails to renew its lease in time.
@@ -113,19 +113,20 @@ make run-shard

## Testing the Sharding Setup

Independent of the used setup (skaffold-based or running on the host machine), you should be able to create sharded `ConfigMaps` in the `default` namespace as configured in the `example` `ClusterRing`.
Independent of the used setup (skaffold-based or running on the host machine), you should be able to create sharded `ConfigMaps` in the `default` namespace as configured in the `example` `ControllerRing`.
The `Secrets` created by the example shard controller should be assigned to the same shard as the owning `ConfigMap`:

```bash
$ kubectl create cm foo
configmap/foo created

$ kubectl get cm,secret -L shard.alpha.sharding.timebertt.dev/clusterring-50d858e0-example
NAME DATA AGE CLUSTERRING-50D858E0-EXAMPLE
configmap/foo 0 3s shard-5fc87c9fb7-kfb2z
$ kubectl get cm,secret -L shard.alpha.sharding.timebertt.dev/50d858e0-example
NAME DATA AGE 50D858E0-EXAMPLE
configmap/foo 0 1s shard-656d588475-5746d

NAME TYPE DATA AGE CLUSTERRING-50D858E0-EXAMPLE
secret/dummy-foo Opaque 0 3s shard-5fc87c9fb7-kfb2z
NAME TYPE DATA AGE 50D858E0-EXAMPLE
secret/dummy-foo Opaque 0 1s shard-656d588475-5746d
secret/dummy-kube-root-ca.crt Opaque 0 2m14s
```

## Monitoring