
Adding sidecars to nodepools for any additional functionality #928

Open · wants to merge 3 commits into main

Conversation

@samof76 commented Jan 6, 2025

Description

Adds sidecar functionality to nodepools.

Issues Resolved

No issue was created, but I will try to explain why we need sidecars. Since a pod's containers run in their own network namespace, that namespace cannot be monitored by a node-exporter running on the node as a DaemonSet. To monitor the pod's network namespace, it is necessary to run node-exporter as a sidecar inside the pod.

This is especially important in systems with very high network throughput.
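
For illustration, a nodepool with the proposed sidecars field could look roughly like this (the field name and its shape are only a sketch of the idea, not the final API):

apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: my-cluster
spec:
  nodePools:
    - component: data
      replicas: 3
      # Proposed addition: extra containers appended to every pod of this
      # nodepool, next to the OpenSearch container.
      sidecars:
        - name: metrics-sidecar
          image: example/metrics-sidecar:latest
          ports:
            - containerPort: 9100
              name: metrics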

Check List

  • Commits are signed per the DCO using --signoff
  • Unittest added for the new/changed functionality and all unit tests are successful
  • Customer-visible features documented
  • No linter warnings (make lint)

If CRDs are changed:

  • CRD YAMLs updated (make manifests) and also copied into the helm chart
  • Changes to CRDs documented

Please refer to the PR guidelines before submitting this pull request.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@prudhvigodithi (Member)

Thanks @samof76. Adding @swoehrl-mw to please take a look and review.

@prudhvigodithi (Member) commented Jan 9, 2025

Thanks for your contribution @samof76. I understand the motivation of this PR is to support generic sidecars. Coming to the root problem with node-exporter: even when node-exporter is installed as a DaemonSet and can reach the OpenSearch pods via the cluster DNS, there are limitations in getting pod-level (pod-specific) metrics. I'm just curious whether there is a solution for this supported out of the box by Prometheus, or whether a sidecar is really the only way to solve it. I was reading some examples and saw https://github.com/lightstep/opentelemetry-prometheus-sidecar, https://docs.lightstep.com/docs/replace-prometheus-with-an-otel-collector-on-kubernetes and https://signoz.io/guides/how-to-monitor-custom-kubernetes-pod-metrics-using-prometheus/. I assume it should be possible to scrape the pod-specific metrics.
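
For example, with the prometheus-operator a PodMonitor can already scrape pod-level endpoints directly, something like the following (the label selector and port name are just placeholders):

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: opensearch-pods
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: opensearch   # placeholder label
  podMetricsEndpoints:
    - port: metrics                        # placeholder port name on the pod
      path: /metrics

This only covers metrics that some container in the pod actually exposes, so it does not by itself answer the network-namespace question.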

Adding @eyenx @lukas-vlcek @getsaurabh02 @swoehrl-mw

@eyenx commented Jan 10, 2025

I am not sure this is how node exporter should be used. It's called node exporter for a reason. We'd rather have the application export proper metrics, either directly or via the Prometheus exporter.

@swoehrl-mw (Collaborator)

The ability to configure sidecars is one I can support. But I agree with @eyenx: node-exporter does not belong in a sidecar. OpenSearch-specific metrics can be collected via the dedicated exporter that can be configured through the monitoring feature; other metrics should come from Kubernetes or node-level exporters.
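
(For reference, the monitoring feature is enabled on the cluster spec roughly like this; field names as I remember them from the operator user guide, please double-check against the current docs:)

apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: my-cluster
spec:
  general:
    monitoring:
      enable: true   # lets the operator manage the OpenSearch Prometheus exporter for the cluster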

@samof76
Please rework the example in the docs into a meaningful example that has nothing to do with node-exporter or fluent-bit (both belong on the node level and I don't want either presented as a good example in the docs).
Also, you did not copy the changed CRD YAMLs into the operator chart.

@samof76 (Author) commented Jan 28, 2025

@eyenx @swoehrl-mw

Since all of a pod's containers run in the pod's isolated network namespace, it is not easy to get those metrics from a node-exporter running as a DaemonSet on the node. So we run a node-exporter sidecar in all our high-throughput pods to collect the TCP metrics, with something like this...

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  ...
  ...
  template:
    ...
    ...
    spec:
      ...
      ...
      containers:
      - image: foo
        ...
        ...
      - image: node-exporter
        name: node-exporter
        args:
        - --web.listen-address=0.0.0.0:9100
        - --collector.disable-defaults
        - --web.disable-exporter-metrics
        - --collector.conntrack
        - --collector.filefd
        - --collector.netstat
        - --collector.sockstat
        ...
        ...
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["all"]
        ...
        ...

And it is currently the only way to get the pod's network namespace metrics. So supporting this pattern is critical, as it will help monitor OpenSearch more effectively.
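
To have Prometheus pick up such a sidecar, we rely on the usual prometheus.io annotations on the pod template (these only take effect if the Prometheus scrape config has the matching kubernetes_sd_configs/relabeling rules, so treat this as an assumption about the monitoring setup):

  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9100"
        prometheus.io/path: "/metrics"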

@eyenx commented Jan 30, 2025

And it is currently the only way to get the pod's network namespace metrics.

I disagree. We use the kubelet metrics to gather series like

container_network_receive_bytes_total
container_network_transmit_bytes_total
...

These metrics are scraped from the kubelet via cAdvisor:

https://www.cloudforecast.io/blog/cadvisor-and-kubernetes-monitoring-guide/
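
For completeness, a typical scrape job for the kubelet's cAdvisor endpoint looks roughly like this (details such as TLS and relabeling vary per distribution and Prometheus setup):

scrape_configs:
  - job_name: kubernetes-cadvisor
    scheme: https
    metrics_path: /metrics/cadvisor
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node

The container_network_* series it exposes are reported per pod, since all containers in a pod share the same network namespace.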
