vector agent error: connection reset by peer #22501

Open
saarzur123 opened this issue Feb 24, 2025 · 0 comments
Labels
type: bug A code related bug.

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

Hello Vector Team,

Lately, we've been experiencing a high volume of errors from the Vector agent:

Error: "trying to connect: Connection reset by peer (os error 104)."

Our Setup:
Vector Agent: Running as a DaemonSet (one pod per node).
Vector Aggregator: Running as a StatefulSet.
Log Flow: The agent sends Kubernetes logs to the aggregator.
Clusters: We manage ~72 clusters. The error occurs intermittently, both across different clusters and within a single cluster.

Key Observations:
None of the agent or aggregator pods have restarted.
No relevant errors appear in the logs.
No significant Kubernetes events have been reported.

We’d appreciate any insights on what could be causing these connection resets and how we can troubleshoot further.
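
As one possible next step, a probe along the following lines could be run from a debug pod on an affected node (the distroless agent image has no Python interpreter). It is a minimal sketch, not part of Vector: the `probe` function and its outcome labels are hypothetical names introduced here. It only opens short-lived TCP connections and tallies how each attempt ends, so it can show whether resets also occur outside of Vector. With an Istio sidecar injected, the local proxy may accept the connection itself, so a clean result here would not rule out resets further along the path.

```python
import socket
from collections import Counter


def probe(host, port, attempts=20, timeout=3.0):
    """Open `attempts` short-lived TCP connections and tally the outcomes.

    Hypothetical helper for troubleshooting; it does not speak the Vector
    protocol, it only checks whether the TCP connect itself succeeds.
    """
    outcomes = Counter()
    for _ in range(attempts):
        try:
            # Connect, then close immediately when the `with` block exits.
            with socket.create_connection((host, port), timeout=timeout):
                outcomes["ok"] += 1
        except ConnectionResetError:
            # errno 104 (ECONNRESET), the error reported in this issue.
            outcomes["reset"] += 1
        except ConnectionRefusedError:
            outcomes["refused"] += 1
        except socket.timeout:
            outcomes["timeout"] += 1
        except OSError:
            # Anything else, including DNS resolution failures.
            outcomes["other"] += 1
    return outcomes
```

For the setup described here, a call such as `probe("apigw-edge-vector-aggregator.vector-aggregator.svc.cluster.local", 7500)` would target the aggregator address from the configuration below. A non-zero `reset` count would point toward the network path or proxy rather than the agent.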

Thanks in advance for your help!

Configuration

vector:
  customConfig:
    api:
      enabled: false
    data_dir: /vector-data-dir
    expire_metrics_secs: 900
    sinks:
      prom_exporter:
        address: 0.0.0.0:9598
        inputs:
        - internal_metrics
        type: prometheus_exporter
      vector_aggregator:
        address: apigw-edge-vector-aggregator.vector-aggregator.svc.cluster.local:7500
        inputs:
        - kubernetes_logs
        type: vector
    sources:
      internal_metrics:
        scrape_interval_secs: 60
        type: internal_metrics
      kubernetes_logs:
        glob_minimum_cooldown_ms: 1000
        max_read_bytes: 8192
        namespace_annotation_fields:
          namespace_labels: ""
        node_annotation_fields:
          node_labels: ""
        pod_annotation_fields:
          container_image_id: ""
          pod_annotations: ""
          pod_ips: ""
          pod_labels: ""
          pod_namespace: kubernetes.namespace_name
          pod_owner: ""
          pod_uid: ""
        type: kubernetes_logs
        use_apiserver_cache: true
  podLabels:
    sidecar.istio.io/inject: "true"
    vector.dev/exclude: "false"
  resources:
    limits:
      cpu: 1000m
      memory: 2Gi
    requests:
      cpu: 225m
      memory: 750Mi
  role: Agent
  rollWorkload: false
  tolerations:
  - effect: NoSchedule
    key: WorkGroup
    operator: Equal
    value: gwproxy
  - effect: NoExecute
    key: WorkGroup
    operator: Equal
    value: gwproxy

Version

0.44.0-distroless-libc

Debug Output


Example Data

No response

Additional Context

No response

References

No response
