Add the capability to observe a newly installed Block Node until it reaches a healthy state.
Add the capability to convey readiness of each deployed Block Node via the custom resource status block.
Add the capability to set, monitor, and update configuration for each deployed Block Node.
This includes keeping each Block Node's configuration synchronized with the CRD spec block.
Ensure all valid configuration values are fully documented and specified
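The Level I items above (observing health, reporting readiness through the status block, and keeping configuration synchronized to the spec block) can be sketched as a single reconcile step. This is only an illustration under stated assumptions: the `BlockNodeResource` class, the status field names (`ready`, `configInSync`, `pendingKeys`), and the example configuration keys are hypothetical and are not taken from any existing Block Node CRD.

```python
from dataclasses import dataclass, field


@dataclass
class BlockNodeResource:
    """Hypothetical in-memory view of one Block Node custom resource."""
    spec: dict                                     # desired configuration (CRD spec block)
    observed: dict = field(default_factory=dict)   # configuration read back from the node
    healthy: bool = False                          # result of the latest health probe
    status: dict = field(default_factory=dict)     # what the operator reports (CRD status block)


def reconcile(resource: BlockNodeResource) -> dict:
    """Return the config keys that must change and refresh the status block."""
    # A key has drifted when the observed value differs from the desired one.
    drift = {
        key: value
        for key, value in resource.spec.items()
        if resource.observed.get(key) != value
    }
    # Assumed rule: a node is ready only when it is healthy and has no drift.
    resource.status = {
        "ready": resource.healthy and not drift,
        "configInSync": not drift,
        "pendingKeys": sorted(drift),
    }
    return drift
```

A real operator would run this logic inside its reconcile loop and write the status back through the Kubernetes API; the sketch only shows the decision that loop would make.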
Upgrade the Operator to Level II
Add the capability to upgrade a Block Node via the operator
Add the capability to upgrade all managed Block Nodes when the operator is upgraded
Add the capability to upgrade all (or any subset of) managed Block Nodes to a new Block Node version
Add the capability to upgrade versions of Block Node managed by an older version of the Operator to a version supported by the current version of the Operator.
Add the capability to upgrade the Operator without upgrading all managed Block Nodes.
If some managed Block Nodes are too old to manage, the Operator may upgrade them to the oldest supported version when upgrading the Operator.
Prior to upgrading, report the inability to manage versions older than the supported range, and the pending upgrade of those versions, via the CRD status block.
Note, we may wish to use the Operator Lifecycle Manager to better support Level II and Level III capabilities
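The rule for Block Nodes that are too old to manage can be sketched as a planning step that runs before the Operator upgrade. This is a minimal sketch under stated assumptions: three-part numeric version strings, the function names, and the per-node fields `manageable` and `pendingUpgradeTo` are all hypothetical. Versions are compared as integer tuples so that, for example, 0.9.2 sorts below 0.10.0.

```python
def parse_version(text: str) -> tuple:
    """Parse an assumed 'major.minor.patch' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def plan_operator_upgrade(nodes: dict, oldest_supported: str) -> dict:
    """Build the per-node report to publish in the status block before upgrading.

    nodes maps a Block Node name to its currently deployed version.
    """
    floor = parse_version(oldest_supported)
    plan = {}
    for name, current in nodes.items():
        if parse_version(current) < floor:
            # Too old to manage: report it and schedule an upgrade
            # to the oldest supported version.
            plan[name] = {"manageable": False, "pendingUpgradeTo": oldest_supported}
        else:
            # Inside the supported range: left untouched by the Operator upgrade.
            plan[name] = {"manageable": True, "pendingUpgradeTo": None}
    return plan
```

Producing the plan as plain data first lets the Operator report the pending upgrades before it changes anything.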
Upgrade the Operator to Level III
Add the capability to create a backup of a Block Node
Add the capability to restore a backup of a Block Node
Add the capability to orchestrate complex re-configuration flows for a Block Node
Items excluded from Level III
The following are not supported, because Block Nodes are not (currently) clustered resources with multiple instances of various components and dynamic scaling
Add the capability to add/remove members from a clustered Block Node
Add the capability to fail-over and fail-back clustered Block Nodes
Add the capability for application-aware dynamic scaling of Block Nodes
Upgrade the Operator to Level IV
Add the capability to expose useful metrics for Operator health
Add the capability to expose health and performance metrics for each Block Node
These should be collected by the Operator, which then publishes them to OpenTelemetry endpoints.
Add the capability to collect and publish "useful" alerts from managed Block Nodes
"Useful" here refers to symptoms that are associated with end-user pain rather than trying to catch every possible way that pain could be caused. Alerts should link to relevant consoles and make it easy to figure out which component is at fault.
Add the capability to emit custom events relating to alert conditions on the managed Block Nodes
Add the capability to use Operator Metering to manage cluster resource consumption.
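The "useful alert" idea above, which is to alert on a user-visible symptom, name the component at fault, and link to a console, can be illustrated with a small check. This is a sketch only: the `Alert` fields, the failure-ratio symptom, and every parameter name are assumptions chosen for illustration, not part of any Block Node or Operator API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    node: str          # managed Block Node the alert concerns
    component: str     # component believed to be at fault
    summary: str       # the symptom, in user-facing terms
    console_url: str   # where to investigate further


def evaluate_symptom(node: str, component: str, failed: int, total: int,
                     max_failure_ratio: float, console_url: str) -> Optional[Alert]:
    """Return an Alert only when the failure ratio exceeds the allowed limit."""
    if total == 0:
        return None  # no traffic, so no observable end-user pain
    ratio = failed / total
    if ratio <= max_failure_ratio:
        return None
    return Alert(
        node=node,
        component=component,
        summary=f"{ratio:.1%} of requests failed (limit {max_failure_ratio:.1%})",
        console_url=console_url,
    )
```

The check keys on what users experience (failed requests) rather than on any one possible cause, so a single rule covers many underlying faults.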
jsync-swirlds changed the title from "EPIC: Build a Kubernetes Operator capable of deploying and _managing_ a Block Node and related components through the full lifecycle." to "EPIC: Build a Kubernetes Operator capable of deploying and managing a Block Node and related components through the full lifecycle." on Jan 27, 2025
Epic Goal
Produce a fully functional Kubernetes Operator implementing at least "Level IV" capabilities.
Major Tasks
Create a Level I Operator
Upgrade the Operator to Level II
Note, we may wish to use the Operator Lifecycle Manager to better support Level II and Level III capabilities
Upgrade the Operator to Level III
Items excluded from Level III
The following are not supported, because Block Nodes are not (currently) clustered resources with multiple instances of various components and dynamic scaling
Upgrade the Operator to Level IV