Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: recreate autoscaler resources if deleted or modified #2409

Merged
merged 2 commits into from
Feb 8, 2025

Conversation

blumamir
Copy link
Collaborator

@blumamir blumamir commented Feb 8, 2025

If for any reason, the data-collection daemonset, gateway deployment, or one of the config maps is deleted, then it's not re-created at the moment.

This PR makes it so they are.

We had a bug where the collectors group would get deleted, and it also deleted the daemonset. then if the collectors group is re-created immediately, the daemonset would stay deleted and odigos would not work

@damemi
Copy link
Contributor

damemi commented Feb 8, 2025

great fix, do you think it would be worth adding a test? Something that just creates a pipeline, deletes the pod, and checks to see that it comes back

@blumamir
Copy link
Collaborator Author

blumamir commented Feb 8, 2025

great fix, do you think it would be worth adding a test? Something that just creates a pipeline, deletes the pod, and checks to see that it comes back

While it's great to have many tests, I feel this behavior is really niche and the value we will get from asserting it is low compared to the time and ongoing maintenance we would have to invest in the test.

But I'm open to be convinced otherwise if you feel it's important.

@BenElferink BenElferink added the bug Something isn't working label Feb 8, 2025
@@ -61,6 +63,9 @@ func (r *CollectorsGroupReconciler) Reconcile(ctx context.Context, req ctrl.Requ
func (r *CollectorsGroupReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&odigosv1.CollectorsGroup{}).
Owns(&appsv1.DaemonSet{}). // in case the ds is deleted or modified for any reason, this will reconcile and recreate it
Owns(&appsv1.Deployment{}). // in case the deployment is deleted or modified for any reason, this will reconcile and recreate it
Owns(&corev1.ConfigMap{}). // in case the configmap is deleted or modified for any reason, this will reconcile and recreate it
Copy link
Collaborator

@RonFed RonFed Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should update the event filter.
This will add a lot of events which we can easily filter by namespace.
Or do we filter in the cache?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure exactly how it's implemented, but i've tested it to work, and I believe the reconcile will be called only if the resource has an owner of this type which we do.

Anyhow, the cache pulls in only the relevant objects, so this will not trigger on every daemonset or deployment in the cluster, just those in cache

@BenElferink BenElferink enabled auto-merge (squash) February 8, 2025 19:10
@BenElferink BenElferink merged commit 09fceb4 into odigos-io:main Feb 8, 2025
45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants