From e50974c8031507230a4454421e803db8923ff182 Mon Sep 17 00:00:00 2001 From: Andreas Fritzler Date: Tue, 17 Dec 2024 13:38:09 +0100 Subject: [PATCH] Add `ServerMaintenance` CRD --- docs/concepts/servermaintenance.md | 76 ++++++++++++++++++++++++++++++ mkdocs.yml | 1 + 2 files changed, 77 insertions(+) create mode 100644 docs/concepts/servermaintenance.md diff --git a/docs/concepts/servermaintenance.md b/docs/concepts/servermaintenance.md new file mode 100644 index 0000000..c9cfbcf --- /dev/null +++ b/docs/concepts/servermaintenance.md @@ -0,0 +1,76 @@ +# ServerMaintenance + +`ServerMaintenance` represents a maintenance operation for a physical server. It transitions a `Server` from its +current operational state (e.g., Available/Reserved) into a Maintenance state. Each `ServerMaintenance` object tracks the lifecycle of a maintenance task, ensuring servers are properly taken offline, updated, and restored. + +## Key Points + +- `ServerMaintenance` is namespaced and may represent various maintenance operations. +- Only one `ServerMaintenance` can be active per `Server` at a time. Others remain pending. +- When the active `ServerMaintenance` completes, the next pending one (if any) starts. +- If no more maintenance tasks are pending, the `Server` returns to its previous operational state. +- `policy` determines how maintenance starts: + - **OwnerApproval:** Requires a label (e.g., `ok-to-maintenance: "true"`) on the `ServerClaim`. + - **Forceful:** Does not require owner approval. + +## Workflow + +1. A separate operator (e.g., `foo-maintenance-operator`) or user creates a `ServerMaintenance` resource referencing a + specific `Server`. +2. If `policy` is `OwnerApproval` and no `ok-to-maintenance` label is set on the `ServerClaim`, the `ServerMaintenance` + stays in `Pending`. The `Server` also remains unchanged. +3. If `policy` is `OwnerApproval` and the `ok-to-maintenance` label is present (or if `alwaysPerformMaintenance` is + enabled), or if the policy is `Forceful`, the `metal-operator` transitions the `Server` into `Maintenance` and + updates the `ServerMaintenance` state accordingly. +4. The maintenance operator powers off the `Server`, applies a `ServerBootConfiguration` (if needed), performs the + maintenance, and sets `ServerMaintenance` to `Complete`. +5. The `metal-operator` transitions the `Server` back to its prior state. If additional `ServerMaintenance` objects are + pending, the next one is processed. + +## Sequence Diagram + +```mermaid +sequenceDiagram + participant Ext as External entity (e.g. foo-maintenance-operator) + participant SC as ServerClaim + participant MO as metal-operator + participant SM as ServerMaintenance + participant S as Server + participant SBC as ServerBootConfiguration + + Ext->>MO: Create SM (ServerMaintenance) resource for S + MO->>SC: Check for ok-to-maintenance label (if policy=OwnerApproval) + alt Policy = OwnerApproval AND no ok-to-maintenance label + Note over MO: SM stays Pending, Server remains unchanged + else Policy = OwnerApproval AND label present OR Policy = Forceful + MO->>S: Update Server state to "InMaintenance" + MO->>SM: Set SM state to "InMaintenance" + Ext->>S: Power off Server + Ext->>MO: Create SBC and reference in Server + Note over Ext: Perform maintenance + Ext->>SM: Update SM state to "Complete" + MO->>S: Return Server to Available/Reserved + Ext->>S: Power on Server + end +``` + +## Example + +```yaml +apiVersion: metal.ironcore.dev/v1alpha1 +kind: ServerMaintenance +metadata: + name: bios-update + namespace: ops +spec: + policy: OwnerApproval + serverRef: + name: server-foo +status: + state: Pending +``` + +If `policy: OwnerApproval` and no `ok-to-maintenance` label exists on the `ServerClaim`, this `ServerMaintenance` +remains `Pending`, and the `Server` stays as is. Once the label is added (or if the operator setting +`alwaysPerformMaintenance` is enabled), the `metal-operator` transitions the `Server` to `InMaintenance`, and the +maintenance operator performs the maintenance task. diff --git a/mkdocs.yml b/mkdocs.yml index dc84df9..7d0d06f 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -56,6 +56,7 @@ nav: - Servers: concepts/servers.md - ServerBootConfigurations: concepts/serverbootconfigurations.md - ServerClaims: concepts/serverclaims.md + - ServerMaintenance: concepts/servermaintenance.md - Usage: - metalctl: usage/metalctl.md - Development Guide: