Showing 2 changed files with 60 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
# ServerMaintenance | ||
|
||
`ServerMaintenance` represents a maintenance operation for a physical server. It transitions a `Server` from an | ||
operational state (e.g., Available/Reserved) into a Maintenance state. Each `ServerMaintenance` object tracks the | ||
lifecycle of a specific maintenance task, ensuring servers are properly taken offline, updated, and restored. | ||
|
||
## Key Points | ||
|
||
- `ServerMaintenance` is namespaced and can represent various maintenance types (e.g., BIOSUpdate, Cleanup). | ||
- Only one `ServerMaintenance` can be active for a given `Server` at a time. Others remain pending. | ||
- When the active `ServerMaintenance` completes, the next pending maintenance starts. | ||
- If no more maintenance tasks are pending, the `Server` returns to its previous operational state and can be | ||
powered back on. | ||
- The `metal-operator` manages `ServerMaintenance` resources and updates the `Server` state accordingly. | ||
- A maintenance-related operator (e.g., `firmware-operator`) decides whether a `Server` needs maintenance. It creates the | ||
`ServerMaintenance` resource and a corresponding `ServerBootConfiguration`, references the latter as the | ||
MaintenanceBootConfiguration in the `Server` spec, and handles powering the server on and off. | ||
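
The queueing rules in the list above (one active `ServerMaintenance` per `Server`, the rest pending, previous state restored when the queue drains) can be modelled in a few lines. This is a minimal in-memory sketch in Python, not the `metal-operator` implementation: the class and function names, the `Active` state label, and the first-in-first-out ordering are assumptions made for illustration. Only `Pending`, `Complete`, and `Maintenance` come from the text.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class ServerMaintenance:
    name: str
    state: str = "Pending"  # "Pending" and "Complete" appear in the document


@dataclass
class Server:
    name: str
    state: str = "Available"
    previous_state: Optional[str] = None
    # Hypothetical FIFO queue; the document does not specify the ordering.
    queue: Deque[ServerMaintenance] = field(default_factory=deque)


def reconcile(server: Server) -> None:
    """Bring the server state in line with its queued maintenances."""
    # Drop finished maintenances from the front of the queue.
    while server.queue and server.queue[0].state == "Complete":
        server.queue.popleft()

    if not server.queue:
        # Nothing left to do: return to the remembered operational state.
        if server.state == "Maintenance":
            server.state = server.previous_state or "Available"
            server.previous_state = None
        return

    if server.state != "Maintenance":
        # Remember the operational state so it can be restored later.
        server.previous_state = server.state
        server.state = "Maintenance"

    head = server.queue[0]
    if head.state == "Pending":
        head.state = "Active"  # assumed label for the single active maintenance


def request_maintenance(server: Server, maintenance: ServerMaintenance) -> None:
    """Queue a maintenance; it becomes active only if nothing else is active."""
    server.queue.append(maintenance)
    reconcile(server)


def complete(server: Server, maintenance: ServerMaintenance) -> None:
    """Mark the active maintenance as done and start the next pending one."""
    if maintenance.state != "Active":
        raise ValueError(f"{maintenance.name} is not the active maintenance")
    maintenance.state = "Complete"
    reconcile(server)
```

Queueing a BIOS update and a cleanup against a `Reserved` server with `request_maintenance` leaves the first one active and the second pending; calling `complete` on each in turn brings the server back to `Reserved`.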
|
||
## Workflow | ||
|
||
1. **Determining Maintenance:** | ||
The maintenance operator creates a `ServerMaintenance` resource for the chosen `Server`. | ||
|
||
2. **Transition to Maintenance:** | ||
The `metal-operator` notices the new `ServerMaintenance` and transitions the `Server` into the `Maintenance` state. | ||
|
||
3. **Boot Configuration:** | ||
The maintenance operator creates a `ServerBootConfiguration` resource and references it in the `Server` spec as the | ||
MaintenanceBootConfiguration. This configuration is used, for example, to boot custom tooling that performs BIOS/firmware updates | ||
or run cleanup tasks on the `Server`. | ||
|
||
4. **Performing Maintenance:** | ||
The maintenance operator powers off the `Server`, performs the required maintenance task, then updates the | ||
`ServerMaintenance` to `Complete`. | ||
|
||
5. **Post-Maintenance:** | ||
Once complete, if no more maintenance tasks are pending, the `metal-operator` restores the `Server` to its previous | ||
state (e.g., Available/Reserved). The maintenance operator can then power the `Server` back on. | ||
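
The five steps above split the work between two actors. The following Python sketch walks a single maintenance through them in order and records which operator acts at each step. It is a simplified, sequential model under stated assumptions: `run_maintenance`, the field names, and the `<name>-boot` configuration name are hypothetical, and a real controller would react to resource changes rather than run one function top to bottom.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Server:
    name: str
    state: str = "Available"
    powered_on: bool = True
    previous_state: Optional[str] = None
    maintenance_boot_config: Optional[str] = None  # hypothetical field name


@dataclass
class ServerMaintenance:
    name: str
    server_name: str
    state: str = "Pending"


def run_maintenance(
    server: Server, name: str, task: Callable[[Server], None]
) -> Tuple[ServerMaintenance, List[str]]:
    """Run one maintenance through the five workflow steps, logging each."""
    log: List[str] = []

    # 1. Determining maintenance (maintenance operator).
    maintenance = ServerMaintenance(name=name, server_name=server.name)
    log.append("maintenance operator: created ServerMaintenance")

    # 2. Transition to maintenance (metal-operator).
    server.previous_state = server.state
    server.state = "Maintenance"
    log.append("metal-operator: Server -> Maintenance")

    # 3. Boot configuration (maintenance operator); the name is illustrative.
    server.maintenance_boot_config = f"{name}-boot"
    log.append("maintenance operator: set MaintenanceBootConfiguration")

    # 4. Performing maintenance (maintenance operator).
    server.powered_on = False
    task(server)
    maintenance.state = "Complete"
    log.append("maintenance operator: ServerMaintenance -> Complete")

    # 5. Post-maintenance: metal-operator restores the state, then the
    #    maintenance operator powers the server back on.
    server.state = server.previous_state
    server.previous_state = None
    server.powered_on = True
    log.append("metal-operator: Server restored; powered back on")

    return maintenance, log
```

Passing a callable as `task` keeps the sketch independent of what the maintenance actually does (BIOS update, cleanup, and so on); the callable sees the server powered off and in the `Maintenance` state.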
|
||
## Example ServerMaintenance Resource | ||
|
||
```yaml | ||
apiVersion: metal.ironcore.dev/v1alpha1 | ||
kind: ServerMaintenance | ||
metadata: | ||
name: bios-update | ||
namespace: ops | ||
spec: | ||
type: BIOSUpdate | ||
serverRef: | ||
name: server-foo | ||
status: | ||
state: Pending | ||
``` | ||
Once conditions are met, the `metal-operator` transitions the `Server` to `Maintenance`. The maintenance operator | ||
powers the server off, applies the maintenance boot configuration, performs the maintenance, marks the `ServerMaintenance` | ||
as `Complete`, and then powers the server on. If further `ServerMaintenance` objects are pending for the same `Server`, the | ||
next one starts; otherwise, the `Server` returns to its previous operational state. |