The master node in cluster mode has failed, and the slave node cannot be upgraded to the master node #1216

Open
kingmayong opened this issue Jan 23, 2025 · 0 comments
Labels
bug Something isn't working


What version of redis operator are you using?

ghcr.io/ot-container-kit/redis-operator/redis-operator:v0.19.0

kubectl logs <_redis-operator_pod_name> -n <namespace>

{"level":"info","ts":"2025-01-23T03:01:19Z","msg":"Number of Redis nodes match desired","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"rediscluster","namespace":"redis"},"namespace":"redis","name":"rediscluster","reconcileID":"8d709b50-98e2-41dd-b29a-a8641ce26039"}
{"level":"info","ts":"2025-01-23T03:01:19Z","msg":"Secret key found in the secret","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"rediscluster","namespace":"redis"},"namespace":"redis","name":"rediscluster","reconcileID":"8d709b50-98e2-41dd-b29a-a8641ce26039","secretKey":"password"}
{"level":"info","ts":"2025-01-23T03:01:19Z","msg":"Secret key found in the secret","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"rediscluster","namespace":"redis"},"namespace":"redis","name":"rediscluster","reconcileID":"8d709b50-98e2-41dd-b29a-a8641ce26039","secretKey":"password"}
{"level":"info","ts":"2025-01-23T03:01:19Z","msg":"Secret key found in the secret","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"rediscluster","namespace":"redis"},"namespace":"redis","name":"rediscluster","reconcileID":"8d709b50-98e2-41dd-b29a-a8641ce26039","secretKey":"password"}
{"level":"info","ts":"2025-01-23T03:01:20Z","msg":"Secret key found in the secret","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"rediscluster","namespace":"redis"},"namespace":"redis","name":"rediscluster","reconcileID":"8d709b50-98e2-41dd-b29a-a8641ce26039","secretKey":"password"}
{"level":"info","ts":"2025-01-23T03:01:20Z","msg":"Secret key found in the secret","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"rediscluster","namespace":"redis"},"namespace":"redis","name":"rediscluster","reconcileID":"8d709b50-98e2-41dd-b29a-a8641ce26039","secretKey":"password"}
{"level":"info","ts":"2025-01-23T03:01:20Z","msg":"Secret key found in the secret","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"rediscluster","namespace":"redis"},"namespace":"redis","name":"rediscluster","reconcileID":"8d709b50-98e2-41dd-b29a-a8641ce26039","secretKey":"password"}
{"level":"info","ts":"2025-01-23T03:01:21Z","msg":"Secret key found in the secret","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"rediscluster","namespace":"redis"},"namespace":"redis","name":"rediscluster","reconcileID":"8d709b50-98e2-41dd-b29a-a8641ce26039","secretKey":"password"}
{"level":"info","ts":"2025-01-23T03:01:21Z","msg":"Secret key found in the secret","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"rediscluster","namespace":"redis"},"namespace":"redis","name":"rediscluster","reconcileID":"8d709b50-98e2-41dd-b29a-a8641ce26039","secretKey":"password"}
{"level":"info","ts":"2025-01-23T03:01:22Z","msg":"Secret key found in the secret","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"rediscluster","namespace":"redis"},"namespace":"redis","name":"rediscluster","reconcileID":"8d709b50-98e2-41dd-b29a-a8641ce26039","secretKey":"password"}

redis-operator version:

Does this issue reproduce with the latest release?

ghcr.io/ot-container-kit/redis-operator/redis-operator:v0.19.0

What operating system and processor architecture are you using (kubectl version)?

kubectl version Output
$ kubectl version

Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.23.17
WARNING: version difference between client (1.28) and server (1.23) exceeds the supported minor version skew of +/-1

What did you do?
I deployed Redis in cluster mode and wrote a shell script that simulates a leader node failure in an infinite loop, but the follower node was never promoted to leader.

What did you expect to see?
I expected that in cluster mode, when a leader becomes unhealthy, its follower is promoted to leader.
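
One way to check whether a promotion happened is to summarize the output of `redis-cli cluster nodes` from a surviving pod. The sketch below is only an illustration, not part of the original report: the function name `summarize_cluster_nodes` is made up, and the pod name in the usage comment is an assumption about how the operator names pods. It counts healthy masters, healthy replicas, and nodes flagged `fail`, relying on the third field of each `CLUSTER NODES` line being the comma-separated flag list.

```shell
#!/bin/sh
# Summarize `CLUSTER NODES` output read from stdin.
# Field 3 holds comma-separated flags, e.g. "myself,master", "slave", "master,fail".
# A node flagged "fail" is counted as failed regardless of its role.
# The "fail?" flag (suspected failure only) is deliberately not counted as failed.
summarize_cluster_nodes() {
  awk '
    {
      n = split($3, flags, ",")
      role = ""; failed = 0
      for (i = 1; i <= n; i++) {
        if (flags[i] == "master" || flags[i] == "slave") role = flags[i]
        if (flags[i] == "fail") failed = 1
      }
      if (failed) nfail++
      else if (role == "master") nmaster++
      else if (role == "slave") nslave++
    }
    END { printf "masters=%d replicas=%d failed=%d\n", nmaster, nslave, nfail }
  '
}

# Hypothetical usage (pod name and password variable are assumptions):
#   kubectl exec -n redis rediscluster-leader-1 -- \
#     redis-cli -a "$REDIS_PASSWORD" cluster nodes | summarize_cluster_nodes
```

If a follower had taken over, the healthy master count would return to 3 after the leader is killed, with the old leader either flagged `fail` or rejoined as a replica. If it stays at 2, no promotion took place.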

What did you see instead?

redisCluster:
name: "rediscluster"
clusterSize: 3
clusterVersion: v7
persistenceEnabled: true
image: quay.io/opstree/redis
tag: v7.2.6
imagePullPolicy: IfNotPresent
imagePullSecrets: {}
# - name: Secret with Registry credentials
redisSecret:
secretName: "rediscluster"
secretKey: "password"
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 1024Mi
minReadySeconds: 0

# -- Some fields of statefulset are immutable, such as volumeClaimTemplates.
# When set to true, the operator will delete the statefulset and recreate it. Default is false.
recreateStatefulSetOnUpdateInvalid: false
leader:
replicas: 3
serviceType: ClusterIP
affinity: {}
# nodeAffinity:
# requiredDuringSchedulingIgnoredDuringExecution:
# nodeSelectorTerms:
# - matchExpressions:
# - key: disktype
# operator: In
# values:
# - ssd
tolerations: []
# - key: "key"
# operator: "Equal"
# value: "value"
# effect: "NoSchedule"
nodeSelector: null
# memory: medium
securityContext: {}
pdb:
enabled: false
maxUnavailable: 1
minAvailable: 1

follower:
replicas: 3
serviceType: ClusterIP
affinity: null
# nodeAffinity:
# requiredDuringSchedulingIgnoredDuringExecution:
# nodeSelectorTerms:
# - matchExpressions:
# - key: disktype
# operator: In
# values:
# - ssd
tolerations: []
# - key: "key"
# operator: "Equal"
# value: "value"
# effect: "NoSchedule"
nodeSelector: null
# memory: medium
securityContext: {}
pdb:
enabled: false
maxUnavailable: 1
minAvailable: 2

labels: {}
# foo: bar
# test: echo

externalConfig:
enabled: true
data: |
tcp-keepalive 400
slowlog-max-len 158
stream-node-max-bytes 2048
maxclients 10000
loglevel notice
logfile "/var/log/redis.log"
slowlog-log-slower-than 10000
slowlog-max-len 128
requirepass 'xxxxxxx'
externalService:
enabled: true

# annotations:
# foo: bar

serviceType: NodePort
port: 6379

serviceMonitor:
enabled: false
interval: 30s
scrapeTimeout: 10s
namespace: redis

# -- extraLabels are added to the servicemonitor when enabled set to true
extraLabels: {}
# foo: bar
# team: devops

redisExporter:
enabled: true
image: quay.io/opstree/redis-exporter
tag: "v1.44.0"
imagePullPolicy: IfNotPresent
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 100m
memory: 128Mi
env: []
# - name: VAR_NAME
# value: "value1"

sidecars:
name: ""
image: ""
imagePullPolicy: "IfNotPresent"
resources:
limits:
cpu: "100m"
memory: "128Mi"
requests:
cpu: "50m"
memory: "64Mi"
env: {}
# - name: MY_ENV_VAR
# value: "my-env-var-value"

initContainer:
enabled: false
image: ""
imagePullPolicy: "IfNotPresent"
resources: {}
# requests:
# memory: "64Mi"
# cpu: "250m"
# limits:
# memory: "128Mi"
# cpu: "500m"
env: []
command: []
args: []

priorityClassName: ""

storageSpec:
volumeClaimTemplate:
spec:
storageClassName: nfs-client
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 1Gi
nodeConfVolume: true
nodeConfVolumeClaimTemplate:
spec:
storageClassName: nfs-client
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 1Gi

selector: {}

podSecurityContext:
runAsUser: 0
fsGroup: 0

serviceAccountName: redis-sa

TLS:
ca: ca.key
cert: tls.crt
key: tls.key
secret:
secretName: ""

acl:
secret:
secretName: ""

env: []
# - name: VAR_NAME
# value: "value1"

serviceAccountName: ""

@kingmayong kingmayong added the bug Something isn't working label Jan 23, 2025