Discussion on closed-loop and control theory #24
Comments
Hi @todaywasawesome @scottrigby I view Continuous Delivery (CD) as "weak" GitOps, and Continuous Operations as "strong" GitOps. My understanding of our agenda here is whether to require "strong" GitOps by adding a principle about a closed loop control system. From @todaywasawesome
From @scottrigby
From @cdavisafc
@joebowbeer's description about "weak" GitOps versus "strong" GitOps captures the essence of this topic: From @joebowbeer via aws-samples/eks-workshop#1162 (comment)
Concrete examples of "weak" GitOps are Operations by Pull Request via Amazon ECS; there are examples by @rizblie at:
• https://cicd-for-ecs.workshop.aws/en/5-advanced/lab4-gitops.html
• https://cicd-for-ecs.workshop.aws/en/6-other/lab-terraform.html
Concrete examples of "strong" GitOps are Integrated Policy Enforcement via Amazon EKS that @mikestef9 is looking to get input at:
• Amazon EKS: AWS Containers Roadmap: [EKS] [request]: Integrated Policy Enforcement
In the above examples, a single vendor, Amazon Web Services (AWS), offers "weak" GitOps via Amazon ECS, and "strong" GitOps via Amazon EKS.
Concrete examples from different vendors:
"strong" GitOps via Microsoft Azure, by @v-thepet, @EdPrice-MSFT, @v-kents, @alexhart11
• https://docs.microsoft.com/en-us/azure/architecture/example-scenario/gitops-aks/gitops-blueprint-aks
"strong" GitOps via Google Cloud, by @crcsmnky and @ggalloro
• GitOps Con 2021: Shifting Policy Enforcement to the Left using GitOps: Open Policy Agent (OPA) Gatekeeper by @crcsmnky https://www.youtube.com/watch?v=XvQZ3ZDjRls
• GitOps Days 2021: Using Source Code Management Patterns to Configure & Secure Kubernetes Clusters: ACM Policy Controller, based on Open Policy Agent (OPA) Gatekeeper by @ggalloro https://www.youtube.com/watch?v=u2rmx-2MwNA
"strong" GitOps via @redhat-cop Red Hat Community of Practice
• Automate Your Security Practices and Policies on OpenShift With Open Policy Agent by @garethahealy, @wmcdonald404, @noelo, @monodot
As for less-vendor-specific examples:
"weak" GitOps:
• Spinnaker when used without Kubernetes at all https://spinnaker.io/docs/guides/user/pipeline/triggers/github/
• Spinnaker: Application Deployment https://spinnaker.io/docs/concepts/#application-deployment
• Spinnaker supports application deployments without Kubernetes, and with Kubernetes
"strong" GitOps:
• Spinnaker when used with Kubernetes https://spinnaker.io/docs/setup/install/providers/kubernetes-v2/
• Argo CD which requires Kubernetes https://argo-cd.readthedocs.io/en/stable/operator-manual/architecture/#application-controller
• Flux CD which requires Kubernetes https://fluxcd.io/docs/components/
• Jenkins X which requires Kubernetes https://jenkins-x.io/v3/develop/reference/jx/gitops/
• Open Policy Agent (OPA) Gatekeeper - Policy Controller for Kubernetes https://github.com/open-policy-agent/gatekeeper
When K8s is used, there is implicitly "strong" GitOps because K8s is a closed loop control system, explained at https://kubernetes.io/docs/concepts/architecture/controller/
... similar to the definition from https://en.wikipedia.org/wiki/Control_theory#Open-loop_and_closed-loop_(feedback)_control
The above relate to my note at #22 (comment) |
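To make the open-loop versus closed-loop distinction in the comment above concrete, here is a minimal sketch in Python. It is not how Kubernetes or any tool named above is implemented; the function names and the dictionary model of a "cluster" are assumptions made purely for illustration.

```python
def open_loop_deploy(desired: dict, cluster: dict) -> None:
    """Open loop ("weak"): apply the desired state once, never read it back."""
    cluster.update(desired)


def reconcile(desired: dict, cluster: dict) -> list[str]:
    """Closed loop ("strong"): observe actual state, compare, correct drift.

    Returns the names of the resources that had to be corrected.
    """
    corrected = []
    for name, spec in desired.items():
        if cluster.get(name) != spec:      # feedback: the actual state is observed
            cluster[name] = spec
            corrected.append(name)
    for name in list(cluster):
        if name not in desired:            # remove anything that is not declared
            del cluster[name]
            corrected.append(name)
    return corrected


# Hypothetical desired state and an in-memory stand-in for a cluster.
desired = {"web": {"replicas": 3}}
cluster: dict = {}

open_loop_deploy(desired, cluster)
cluster["web"] = {"replicas": 1}           # someone edits the cluster by hand

# The open-loop deploy has no way to notice the drift; a reconcile pass does.
drift_fixed = reconcile(desired, cluster)
print(drift_fixed)  # ['web']
```

In a real controller the reconcile pass would run continuously rather than once; the single call here only shows what the feedback step adds.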
For me, feedback in a GitOps system is from the perspective of the desired state, i.e., if I look at the desired state in Git, can I see whether that state has been applied before I decide whether I can safely change it? Without this feedback, you cannot tell whether you are making changes based on an actual state or on a desired state that has not yet been realized. Deploying either https://fluxcd.io/docs/components/notification/ or https://github.com/Azure/gitops-connector would close the loop on a standard Flux implementation.
PID loops (closed loops that use metrics to make decisions) are orthogonal to GitOps-based systems: they occur one abstraction layer above (e.g. automatically committing the results of the decision to Git) or one layer below (as a controller that uses a policy defined in the desired state). In either case, I don't think this can be a principle, as it would raise the barrier to entry significantly. We should, however, add closed loops as a best practice and possibly create a guide on implementing PID loops in a GitOps-compatible way. |
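The "can I see if the desired state has been applied" question in the comment above can be sketched as a simple check. This is only an illustration under stated assumptions: `SyncStatus` and its fields are hypothetical and do not correspond to the data model of the Flux notification controller or the gitops-connector.

```python
from dataclasses import dataclass


@dataclass
class SyncStatus:
    """Hypothetical status reported back alongside the desired state in Git."""
    desired_revision: str   # commit currently at the head of the config branch
    applied_revision: str   # last commit the agent reports as reconciled
    healthy: bool           # whether the applied state passed its health checks


def safe_to_change(status: SyncStatus) -> bool:
    """True only if the current desired state has been realized and is healthy."""
    return status.healthy and status.desired_revision == status.applied_revision


pending = SyncStatus("abc123", "9f8e7d", healthy=True)   # not yet applied
settled = SyncStatus("abc123", "abc123", healthy=True)   # applied and healthy
print(safe_to_change(pending), safe_to_change(settled))  # False True
```

Without the `applied_revision` feedback, both cases would look identical from Git, which is the ambiguity the comment describes.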
Thank you, @moshloop, for clarifying PID loops at different abstraction layers. If
While this discussion, ...
I respectfully disagree with the assertion,
In practice, principles are open for interpretation. Some people view principles holistically, while other people view principles incrementally as maturity levels (from 1 to
Regardless of the content within principles... The marketing and branding of GitOps or GitOps-like would be used, either with or without GitOps certification, because GitOps is usually a means, and not an end.
To use an analogy:
Our agenda is about making the idea,
@moshloop @todaywasawesome @scottrigby Our GitOps Working Group may be interested in @colmmacc's talk, PID Loops and the Art of Keeping Systems Stable:
Control Theory: Where the fruit is hanging so low IT IS TOUCHING THE GROUND
In particular, starting at 22:13 until 28:07
|
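For readers unfamiliar with the PID loops discussed in this thread, here is a minimal sketch of a proportional-integral-derivative controller in Python. The gains, the "replicas" scenario, and the class itself are illustrative assumptions and are not taken from the talk or from any GitOps tool.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (illustrative gains, no tuning)."""
    kp: float                      # proportional gain
    ki: float                      # integral gain
    kd: float                      # derivative gain
    integral: float = 0.0          # accumulated error over time
    previous_error: float = 0.0    # error seen on the previous update

    def update(self, setpoint: float, measured: float, dt: float = 1.0) -> float:
        """Return the correction to apply, given the target and the observation."""
        error = setpoint - measured
        self.integral += error * dt
        derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical scenario: steer a measured value toward a target.
controller = PID(kp=0.5, ki=0.1, kd=0.0)
target = 10.0
replicas = 2.0
for _ in range(50):
    # Each pass feeds the observed value back into the next decision.
    replicas += controller.update(target, replicas)
```

Following the comment above, a GitOps-compatible variant would commit the controller's decision to Git (one layer above) or run it as a controller driven by a policy in the desired state (one layer below), rather than changing the system directly.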
My instinct is to include it in the principles. I don't think it's exclusionary and it solidifies it as an "actual thing". |
@christianh814 I personally lean that way too. I believe this also helps explain how declarative configurations for things like notification on divergence, rollbacks, and even intelligent agent responses like progressive delivery can fit into GitOps principles when done properly.
@moshloop Thanks, yes. IMO raising the barrier to entry is a very important concern for principles, but I think that, if addressed properly, this does not need to raise the barrier; rather, it helps differentiate GitOps from systems where the software agents take no feedback from the system into account (including a count of previous reconciliation attempts).
🙏 Also requesting response from @chrispat @csand-msft @todaywasawesome @jlbutler @murillodigital, all other WG members, and anyone else who wants to weigh in on this today or tomorrow, before the v1.0.0 release is scheduled. Thanks!
OK, here is the wording we were debating for RC 1 (see “items left out of this PR”) but did not yet include in that or RC 2 because there was not enough group response by that point. How do you think this reads?
|
Even if this is made a principle it would still need to distinguish between the 2 different loops in operation.
i.e. if the principle includes both loops then a Flux deployment without the notification controller would not be "GitOps compliant" and that would be pretty confusing |
@scottrigby ah, sorry, I was working on a PR and didn't see your comment. @moshloop check out the PR and see if that resolves your concern. I'm not completely clear on what you mean. #31 |
How you can help now
Please read Dan’s good issue summary (original issue below), which came out of our last group meeting. The gist of the question is: do we think the originally included “closed loop” principle should be added back before the planned v1.0.0 release milestone (scheduled for EOW, no later than Monday Oct 11)?
We are now asking YOU – WG members & maintainers – to 👍 or 👎 this issue by Friday Oct 8th. (ideally, also comment a short reason).
Note this is not a decision for all time, but just for this first full release. We can continue to discuss for possible inclusion after v1.0.0 if there is too much divergence in opinion. It was left out of RC 1 (see “items left out of this PR”) and not yet included in RC 2 because there was not enough group response so far.
Original issue
This came up as part of an ongoing discussion around #22 with @squaremo @lloydchang @scottrigby @murillodigital.
#22 (comment)
When we did #21, we removed the closed-loop principle because we couldn't really articulate why it was there. Some of the discussions referenced above brought back some of the reasons we had included closed-loop in the first place.
I'm not very familiar with control theory, so the usage of the word "feedback" really threw me off. Feedback seemed to mean that something could happen to your actual state that would then somehow inform your desired state. After going through it with @scottrigby, I understand that control theory uses "feedback" to mean that what is actually occurring in the system is recognized and taken into account.
On some level, I think this is encapsulated in two principles: "declarative" + "continuously reconciled". With those two ideas, I think you can probably get to closed-loop. For example, if you were to create a progressive delivery plan that involved checking metrics and then deciding whether to roll back or move ahead, I would view that as part of GitOps as long as it is declaratively expressed and continuously reconciled. The reconciliation implies the idea of closed-loop feedback.
However, as it has come up over and over again and @scottrigby has pointed out, it may be worthwhile making that idea clearly explicit by making it a principle.
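The progressive delivery example above (check metrics, then roll back or move ahead) can be sketched as a declarative plan plus a reconcile step. This is a minimal illustration in Python; `RolloutPlan`, `Observed`, and the thresholds are hypothetical and do not reflect the format of any particular progressive delivery tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutPlan:
    """Hypothetical declarative plan that would be stored in Git."""
    steps: tuple[int, ...]    # traffic percentages to walk through, ascending
    max_error_rate: float     # roll back if the observed error rate exceeds this


@dataclass
class Observed:
    """What the agent currently sees in the running system (the feedback)."""
    traffic_percent: int
    error_rate: float


def reconcile(plan: RolloutPlan, observed: Observed) -> int:
    """Return the traffic percentage the system should move to next."""
    if observed.error_rate > plan.max_error_rate:
        return 0                               # metrics are bad: roll back
    for step in plan.steps:
        if step > observed.traffic_percent:
            return step                        # metrics are fine: move ahead
    return observed.traffic_percent            # plan complete: hold


plan = RolloutPlan(steps=(10, 50, 100), max_error_rate=0.01)
print(reconcile(plan, Observed(10, 0.002)))   # 50
print(reconcile(plan, Observed(50, 0.05)))    # 0
print(reconcile(plan, Observed(100, 0.0)))    # 100
```

The plan never changes during the rollout; only the observed state does, and each reconcile pass makes its decision from the declared plan and the current observation.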
For #22 I think we're time boxing to move toward GitOps V1 but if we:
then we may move ahead to include it in v1, so long as it does not introduce a delay.
In other words: if this is important to you, please fight for it here and give your ideas for how to communicate it.