High memory utilization on application controller pods that manage clusters with few apps, but many unrelated resources #21546
Hey folks,

We run a hub-spoke model, where we have a single Argo instance (running version …). We noticed that one pod consistently uses 3-4x the memory of other pods and was regularly getting OOMKilled. Through manual shard assignment, we narrowed down the culprit to a cluster that has many more resources in it than other clusters, but relatively fewer apps, and those apps do not produce the number of resources reported by the cluster (it's a much smaller amount).

Here's the memory usage for the pod that controls the cluster with many resources, as well as the section from the Cluster UI that shows the number of apps and resources in the cluster:

[image]

For contrast, here is another cluster with more apps but fewer resources and its memory usage profile:

[image]

For additional context, we have Orphaned Resource detection disabled in all projects, and the pod with high memory usage is only responsible for managing the one cluster with lots of resources, so all memory utilization can be attributed to that single cluster.

Is this behavior expected? I haven't dug into the code, but it seems like the controller queries all resources in the cluster it manages, then filters out the resources it cares about based on the apps in that cluster, rather than doing targeted queries against the API. This leads to very high memory utilization just by virtue of the target cluster having many objects.
Replies: 1 comment 1 reply
It is expected. Argo CD knows which resources you have in git, but it doesn't know what resources might be children of those resources. So it has to load every single resource on the cluster, inspect each resource's owner references, and then build a resource tree out of the resources that it can confirm are actually relevant to your Applications.

But you can provide Argo CD with hints. For example, you can use resource.exclu…

Aside: I wouldn't bother sharding. If you have nodes which can accommodate a single controller shard handling all your clusters, just run a single shard. Otherwise you're complicating the setup and spreading (some) duplicate "housekeeping" work across multiple shards when it could be consolidated into just one.
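To make the reply's explanation concrete, here is a minimal Python sketch of the general idea: index every object in the cluster by its owner, then walk down from the resources declared in git. This is an illustration only, not Argo CD's actual implementation. The function name `build_resource_tree` and the flattened `owner_uids` field are hypothetical (real Kubernetes objects carry this information in `metadata.ownerReferences`), and the sample data is made up.

```python
from collections import defaultdict, deque


def build_resource_tree(cluster_resources, managed_root_uids):
    """Return the UIDs reachable from the managed roots via owner references.

    cluster_resources: iterable of dicts with a "uid" and a hypothetical
        "owner_uids" list (a stand-in for metadata.ownerReferences).
    managed_root_uids: UIDs of the resources declared in git.
    """
    # Index children by owner UID. This pass has to touch every resource in
    # the cluster, so its cost follows cluster size rather than app count.
    children_by_owner = defaultdict(list)
    for res in cluster_resources:
        for owner_uid in res.get("owner_uids", []):
            children_by_owner[owner_uid].append(res["uid"])

    # Breadth-first walk from the roots, collecting all descendants.
    relevant = set()
    queue = deque(managed_root_uids)
    while queue:
        uid = queue.popleft()
        if uid in relevant:
            continue
        relevant.add(uid)
        queue.extend(children_by_owner.get(uid, []))
    return relevant


# Made-up cluster: one managed Deployment and one unrelated Deployment.
cluster = [
    {"uid": "deploy-a", "kind": "Deployment", "owner_uids": []},
    {"uid": "rs-a", "kind": "ReplicaSet", "owner_uids": ["deploy-a"]},
    {"uid": "pod-a1", "kind": "Pod", "owner_uids": ["rs-a"]},
    {"uid": "deploy-x", "kind": "Deployment", "owner_uids": []},
    {"uid": "rs-x", "kind": "ReplicaSet", "owner_uids": ["deploy-x"]},
    {"uid": "pod-x1", "kind": "Pod", "owner_uids": ["rs-x"]},
]

print(sorted(build_resource_tree(cluster, {"deploy-a"})))
# prints ['deploy-a', 'pod-a1', 'rs-a']
```

Even though only `deploy-a` is managed, the indexing loop still reads the three unrelated `*-x` objects before it can rule them out. In this sketch, that is the step whose cost grows with the total number of objects in the cluster.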