[bitnami/redis] after election, Redis master attempts to replicate from itself #31617

Closed
chris-vest opened this issue Jan 28, 2025 · 3 comments
Labels: redis, solved, stale (15 days without activity), tech-issues (the user has a technical issue about an application)


chris-vest commented Jan 28, 2025

Name and Version

bitnami/redis 20.6.3

What architecture are you using?

amd64

What steps will reproduce the bug?

Deploy Redis with Sentinel using the settings given below. After an election, the master tries to sync from itself.

redis-test-seq-redis-sentinel-ue1-node-2 redis 1:S 28 Jan 2025 10:10:23.927 * Connecting to MASTER redis-test-seq-redis-sentinel-ue1-node-2.redis-test-seq-redis-sentinel-ue1-headless.redis-test-seq.svc.cluster.local:6379
redis-test-seq-redis-sentinel-ue1-node-2 redis 1:S 28 Jan 2025 10:10:23.927 * MASTER <-> REPLICA sync started
redis-test-seq-redis-sentinel-ue1-node-2 redis 1:S 28 Jan 2025 10:10:23.927 * Non blocking connect for SYNC fired the event.
redis-test-seq-redis-sentinel-ue1-node-2 redis 1:S 28 Jan 2025 10:10:23.927 * Master replied to PING, replication can continue...
redis-test-seq-redis-sentinel-ue1-node-2 redis 1:S 28 Jan 2025 10:10:23.927 * Trying a partial resynchronization (request 49feb8dee03e307c8cf19575a02be2228ec92403:32706362302).
redis-test-seq-redis-sentinel-ue1-node-2 redis 1:S 28 Jan 2025 10:10:23.927 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master

As the log shows, the pod redis-test-seq-redis-sentinel-ue1-node-2 tries to connect to itself via redis-test-seq-redis-sentinel-ue1-node-2.redis-test-seq-redis-sentinel-ue1-headless.redis-test-seq.svc.cluster.local:6379. This causes a number of issues, and it often takes a long time and several re-elections to resolve itself.

Are you using any custom parameters or values?

replica:
  externalMaster:
    enabled: true
    host: redis-sentinel-master.redis-test-seq.svc.cluster.local

sentinel:
  enabled: true
  externalMaster:
    enabled: true
    host: redis-sentinel-master.redis-test-seq.svc.cluster.local

architecture: "replication"

Also happens without replica.externalMaster.enabled=true:

sentinel:
  enabled: true
  externalMaster:
    enabled: true
    host: redis-sentinel-master.redis-test-seq.svc.cluster.local

architecture: "replication"

What is the expected behavior?

The newly elected master should not attempt to resync from itself after it has been elected.

What do you see instead?

The master attempts to sync from itself after it has been elected, so applications using Sentinel cannot write to the master during this time. It usually takes 2 to 3 elections before this resolves, and many errors occur in the meantime.

You can see the metrics here:

[Image: metrics]

After the third election, the newly elected master stops trying to sync from itself, and the cluster is now healthy again.

Additional information

No response

chris-vest added the tech-issues label Jan 28, 2025
github-actions bot added the triage label Jan 28, 2025
chris-vest (Author) commented

We are able to avoid this by setting sentinel.externalMaster.enabled and sentinel.externalMaster.host only during bootstrapping and then removing those values. Elections between the connected Redis and Sentinel clusters then run smoothly, and the master does not attempt to sync from itself.
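
For readers who want to try the same workaround, the two phases might look roughly like the values below. This is only a sketch built from the settings posted in this issue: the host name is the one from the report, and it assumes that leaving out the sentinel.externalMaster block makes the chart fall back to its default (external master disabled), which should be checked against the chart version in use.

```yaml
# Phase 1 (bootstrap only): point Sentinel at the external master.
# These are the values from the original report.
sentinel:
  enabled: true
  externalMaster:
    enabled: true
    host: redis-sentinel-master.redis-test-seq.svc.cluster.local

architecture: "replication"
---
# Phase 2 (after bootstrap): the same values with sentinel.externalMaster removed.
# Assumption: with the block omitted, the chart uses its default for
# sentinel.externalMaster, so Sentinel manages elections on its own.
sentinel:
  enabled: true

architecture: "replication"
```

The two documents are meant to be applied one after the other to the same release, not together.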

github-actions bot removed the triage label Jan 30, 2025
github-actions bot assigned migruiz4 and unassigned javsalgar Jan 30, 2025

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions bot added the stale (15 days without activity) label Feb 15, 2025

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

bitnami-bot closed this as not planned (won't fix, can't repro, duplicate, stale) Feb 20, 2025