fix: recreate autoscaler resources if deleted or modified #2409

blumamir · 2025-02-08T04:08:04Z

If the data-collection daemonset, the gateway deployment, or one of the config maps is deleted for any reason, it is currently not re-created.

This PR makes the autoscaler re-create them.

We had a bug where deleting the collectors group also deleted the daemonset. If the collectors group was then re-created immediately, the daemonset stayed deleted and Odigos would not work.

damemi · 2025-02-08T04:10:52Z

great fix, do you think it would be worth adding a test? Something that just creates a pipeline, deletes the pod, and checks to see that it comes back

blumamir · 2025-02-08T04:18:21Z

> great fix, do you think it would be worth adding a test? Something that just creates a pipeline, deletes the pod, and checks to see that it comes back

While it's great to have many tests, I feel this behavior is really niche and the value we will get from asserting it is low compared to the time and ongoing maintenance we would have to invest in the test.

But I'm open to being convinced otherwise if you feel it's important.

RonFed · 2025-02-08T07:52:28Z

autoscaler/controllers/collectorsgroup_controller.go

@@ -61,6 +63,9 @@ func (r *CollectorsGroupReconciler) Reconcile(ctx context.Context, req ctrl.Requ
 func (r *CollectorsGroupReconciler) SetupWithManager(mgr ctrl.Manager) error {
 	return ctrl.NewControllerManagedBy(mgr).
 		For(&odigosv1.CollectorsGroup{}).
+		Owns(&appsv1.DaemonSet{}).  // in case the ds is deleted or modified for any reason, this will reconcile and recreate it
+		Owns(&appsv1.Deployment{}). // in case the deployment is deleted or modified for any reason, this will reconcile and recreate it
+		Owns(&corev1.ConfigMap{}).  // in case the configmap is deleted or modified for any reason, this will reconcile and recreate it


I think we should update the event filter.
This will add a lot of events which we can easily filter by namespace.
Or do we filter in the cache?

blumamir · 2025-02-08

Not sure exactly how it's implemented, but I've tested that it works, and I believe reconcile is called only if the resource has an owner of this type, which ours do.

Anyhow, the cache pulls in only the relevant objects, so this will not trigger on every daemonset or deployment in the cluster, just those in the cache.
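To make the owner-based triggering discussed above concrete, here is a minimal, stdlib-only sketch of the idea. It is a simplified model, not controller-runtime's code: `Object`, `OwnerRef`, `Request`, `requestsForOwner`, and `inNamespace` are hypothetical names, and the namespace and object names are made up for illustration. It assumes, as I understand the default behavior of `Owns`, that only a controller owner reference of the watched kind produces a reconcile request.

```go
package main

import "fmt"

// OwnerRef is a simplified, hypothetical stand-in for a Kubernetes owner reference.
type OwnerRef struct {
	Kind       string
	Name       string
	Controller bool // true when this owner is the managing controller
}

// Object is a simplified, hypothetical stand-in for an object's metadata.
type Object struct {
	Kind      string
	Namespace string
	Name      string
	Owners    []OwnerRef
}

// Request identifies the owner that should be reconciled.
type Request struct {
	Namespace string
	Name      string
}

// requestsForOwner maps an event on obj to reconcile requests for its owners.
// Only controller owner references whose kind equals ownerKind are considered
// (assumption: this mirrors the default matching used by Owns).
func requestsForOwner(obj Object, ownerKind string) []Request {
	var reqs []Request
	for _, ref := range obj.Owners {
		if ref.Controller && ref.Kind == ownerKind {
			// A namespaced owner lives in the same namespace as the owned object.
			reqs = append(reqs, Request{Namespace: obj.Namespace, Name: ref.Name})
		}
	}
	return reqs
}

// inNamespace models the namespace filter suggested in the review:
// events for objects outside ns are dropped before any owner mapping.
func inNamespace(ns string) func(Object) bool {
	return func(obj Object) bool { return obj.Namespace == ns }
}

func main() {
	// Hypothetical objects: one daemonset owned by a CollectorsGroup, one unrelated.
	owned := Object{
		Kind: "DaemonSet", Namespace: "example-ns", Name: "collector-ds",
		Owners: []OwnerRef{{Kind: "CollectorsGroup", Name: "node-collectors", Controller: true}},
	}
	unrelated := Object{Kind: "DaemonSet", Namespace: "kube-system", Name: "some-other-ds"}

	keep := inNamespace("example-ns")
	for _, obj := range []Object{owned, unrelated} {
		if !keep(obj) {
			fmt.Printf("%s/%s: filtered out by namespace\n", obj.Namespace, obj.Name)
			continue
		}
		fmt.Printf("%s/%s: requests %v\n", obj.Namespace, obj.Name, requestsForOwner(obj, "CollectorsGroup"))
	}
}
```

In this model the unrelated daemonset never produces a request even without the namespace filter, because it has no matching owner reference. The filter (or a namespace-scoped cache) only saves the work of looking at such events in the first place.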

fix: recreate autoscaler resources if deleted or modified

5598dc5

BenElferink approved these changes Feb 8, 2025

BenElferink added the bug Something isn't working label Feb 8, 2025

RonFed reviewed Feb 8, 2025

RonFed approved these changes Feb 8, 2025

BenElferink enabled auto-merge (squash) February 8, 2025 19:10

Merge branch 'main' into recreate-autoscaler-res

5ac590e

BenElferink merged commit 09fceb4 into odigos-io:main Feb 8, 2025
45 checks passed
