
Cluster leadership does not change when leader's backing disk freezes/becomes unresponsive #22064

mackenzieATA opened this issue Jan 9, 2025

Overview of the Issue

When a Consul leader's disk freezes or becomes unresponsive, leadership does not transition to the remaining instances in the cluster, and the leader isn't detected as unhealthy.

Observed behaviour: Leadership remains unchanged, but all writes to Consul hang and eventually fail with: agent.server: failed to wait for barrier: error="timed out enqueuing operation"

Expected behaviour: Leadership is removed from the unhealthy instance, and another instance in the cluster takes over and handles writes.
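
A quick way to check how the cluster reports the frozen leader's health (a hedged sketch; assumes default ports, no ACLs, and that the commands are run from one of the servers):

consul operator autopilot state                             # autopilot's view of each server's health
curl -s http://127.0.0.1:8500/v1/operator/autopilot/health  # the same information via the HTTP API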


Reproduction Steps

  1. Start up and configure the servers
    1. I started three servers in AWS
    2. On one server, attach an additional volume to be used as Consul's data-dir
      1. https://docs.aws.amazon.com/ebs/latest/userguide/ebs-using-volumes.html
  2. Install Consul on the servers
    1. I used the Linux package manager instructions at https://developer.hashicorp.com/consul/downloads and installed v1.20.2
  3. Start up the Consul cluster
    1. You'll need the IPs of all 3 servers
    2. On the server with the additional volume:
      You must fill in <where you mounted the volume>, <server IP>, <other IP 1>, and <other IP 2>:
      consul agent -server -data-dir <where you mounted the volume> -node frozen_leader -advertise <server IP> -bind 0.0.0.0 -bootstrap-expect 3 -log-level INFO -retry-join <server IP> -retry-join <other IP 1> -retry-join <other IP 2>
    3. On the other two servers, run a variation of this command:
      You must fill in <other server 1|2>, <server IP>, <other IP 1>, and <other IP 2> (these differ between server 2 and server 3):
      consul agent -server -data-dir /tmp -node <other server 1|2> -advertise <server IP> -bind 0.0.0.0 -bootstrap-expect 3 -log-level INFO -retry-join <server IP> -retry-join <other IP 1> -retry-join <other IP 2>
  4. Make the server with the extra volume the leader
    1. consul operator raft list-peers
    2. consul operator raft transfer-leader -id=<leader id>
  5. Freeze the data dir volume with fsfreeze
    1. sudo fsfreeze --freeze <path to mounted data directory>
  6. Observe that the leader stays the same, but any write operation, e.g. consul kv put key value, hangs and eventually fails (a condensed sketch of steps 5-6 follows this list)
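
A condensed sketch of steps 5 and 6, run on the node that holds the extra volume (/mnt/consul-data is a hypothetical stand-in for wherever you mounted it):

consul operator raft list-peers            # confirm this node is the current leader
sudo fsfreeze --freeze /mnt/consul-data    # freeze the filesystem backing Consul's data-dir
time consul kv put my_key 123              # hangs for ~30 seconds, then fails with "timed out enqueuing operation"
consul operator raft list-peers            # leadership has not moved
sudo fsfreeze --unfreeze /mnt/consul-data  # thaw the volume once you're done observing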

Consul info for both Client and Server

Client info

No Clients

Server info
agent:
	check_monitors = 0
	check_ttls = 0
	checks = 0
	services = 0
build:
	prerelease =
	revision = 33e5727a
	version = 1.20.2
	version_metadata =
consul:
	acl = disabled
	bootstrap = false
	known_datacenters = 1
	leader = true
	leader_addr = 10.16.0.186:8300
	server = true
raft:
	applied_index = 2406
	commit_index = 2406
	fsm_pending = 0
	last_contact = 0
	last_log_index = 2406
	last_log_term = 10
	last_snapshot_index = 0
	last_snapshot_term = 0
	latest_configuration = [{Suffrage:Voter ID:1a672c77-928b-0b19-b9e9-c3311459c561 Address:10.16.0.186:8300} {Suffrage:Voter ID:f2eebb9c-8728-c7a5-8832-2e14eed914f8 Address:10.16.0.150:8300} {Suffrage:Voter ID:28f90464-9f81-069f-cb0b-aed769935752 Address:10.16.0.242:8300}]
	latest_configuration_index = 0
	num_peers = 2
	protocol_version = 3
	protocol_version_max = 3
	protocol_version_min = 0
	snapshot_version_max = 1
	snapshot_version_min = 0
	state = Leader
	term = 10
runtime:
	arch = amd64
	cpu_count = 2
	goroutines = 218
	max_procs = 2
	os = linux
	version = go1.22.7
serf_lan:
	coordinate_resets = 0
	encrypted = false
	event_queue = 0
	event_time = 10
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 8
	members = 3
	query_queue = 0
	query_time = 1
serf_wan:
	coordinate_resets = 0
	encrypted = false
	event_queue = 0
	event_time = 1
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 6
	members = 3
	query_queue = 0
	query_time = 1

I compared the leader's and a follower's consul info output; they're virtually identical, so I'll just include the diff against one of the followers:

# diff Leader_consul_info Follower_consul_info
15c15
< 	leader = true
---
> 	leader = false
22c22
< 	last_contact = 0
---
> 	last_contact = 22.842754ms
35c35
< 	state = Leader
---
> 	state = Follower
40c40
< 	goroutines = 218
---
> 	goroutines = 144

For the agent configuration, see the startup commands in the reproduction steps; an equivalent HCL config file is sketched below.
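
The same node configuration expressed as an HCL file rather than CLI flags (a sketch; the field names are the standard config-file equivalents of the flags above, and the placeholders are unchanged from the reproduction steps):

# frozen_leader.hcl - equivalent of the flags passed on the node with the extra volume
server           = true
node_name        = "frozen_leader"
data_dir         = "<where you mounted the volume>"
advertise_addr   = "<server IP>"
bind_addr        = "0.0.0.0"
bootstrap_expect = 3
log_level        = "INFO"
retry_join       = ["<server IP>", "<other IP 1>", "<other IP 2>"]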

Operating system and Environment details

Ubuntu 22.04.5 LTS
x86_64

Log Fragments

I've included the logs seen during an attempted consul kv put my_data4 123; outside of that, there's nothing that looks interesting.
These logs occur at the end of the 30-second wait on the kv put.

...
2025-01-09T22:51:28.941Z [TRACE] agent.server: rpc_server_call: method=Status.RaftStats errored=false request_type=read rpc_type=net/rpc leader=true
2025-01-09T22:51:29.727Z [TRACE] agent.server: rpc_server_call: method=KVS.Apply errored=true request_type=write rpc_type=net/rpc leader=true target_datacenter=dc1 locality=local
2025-01-09T22:51:29.727Z [ERROR] agent.http: Request error: method=PUT url=/v1/kv/my_data4 from=127.0.0.1:15924 error="raft apply failed: timed out enqueuing operation"
2025-01-09T22:51:29.727Z [DEBUG] agent.http: Request finished: method=PUT url=/v1/kv/my_data4 from=127.0.0.1:15924 latency=30.000357704s
2025-01-09T22:51:29.727Z [DEBUG] agent: warning: request content-type is not supported: request-path=/v1/kv/my_data4
2025-01-09T22:51:29.728Z [DEBUG] agent: warning: response content-type header not explicitly set.: request-path=/v1/kv/my_data4
2025-01-09T22:51:30.288Z [TRACE] agent.server.usage_metrics: Starting usage run
2025-01-09T22:51:30.289Z [TRACE] agent.server: rpc_server_call: method=Status.RaftStats errored=false request_type=read rpc_type=net/rpc leader=true
...