Migrate established targets from one node to another in cluster mode #573

Bapths · 2024-12-24T11:08:12Z

Hello,

Thanks a lot for developing and maintaining such a useful tool!

I may be missing something but here is my concern:

Let's say that I have a cluster of 3 gNMIc nodes and I want to update the subscriptions (or perform an upgrade). When I shut down one node, I lose every target connected to that node for ~30s to 1min. Is there any way to force all active connections to close and quickly reopen on another node before doing anything that may have an impact?

Maybe including such a feature in the new API graceful shutdown endpoint? Or maybe there are already solutions to this issue that I am still missing?

Thanks to anyone who can help me solve this issue 😄

karimra · 2025-01-20T04:03:53Z

In #579 I added 3 REST endpoints (all requests must be made to the cluster leader; other instances will return an error):

Switch the cluster leader:
DELETE /api/v1/cluster/leader will make the leader release its lock to allow another instance to grab the leader lock.
Drain an instance:
POST /api/v1/members/{id}/drain where id is the name of the instance to be drained. The leader will move all the targets that instance is subscribed to onto the other instances in the cluster. This is an async call: if that instance has a huge number of targets, draining might take some time, and the API call will return while the targets continue to be moved to other instances.
Rebalance the load between instances:
POST /api/v1/cluster/rebalance will rebalance the number of targets between the cluster instances if it's not balanced. This is also an async call, so it might take some time to complete.

So for your case, you can run a drain on the instance you want to shut down. Once that instance is back up, rebalance the cluster.
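
As a rough illustration of the drain-then-rebalance flow described above, here is a minimal Python sketch that builds the three requests using only the standard library. The endpoint paths and HTTP methods are copied from the comment above; the helper names, the leader address, and the instance name are hypothetical, and nothing here assumes a particular response body or status code from gNMIc.

```python
import urllib.request
from urllib.parse import quote


def build_request(leader_url: str, method: str, path: str) -> urllib.request.Request:
    """Build (but do not send) a request against the cluster leader's REST API."""
    return urllib.request.Request(leader_url.rstrip("/") + path, method=method)


def switch_leader_request(leader_url: str) -> urllib.request.Request:
    # DELETE /api/v1/cluster/leader: ask the leader to release its lock.
    return build_request(leader_url, "DELETE", "/api/v1/cluster/leader")


def drain_request(leader_url: str, instance_name: str) -> urllib.request.Request:
    # POST /api/v1/members/{id}/drain: path as written in the comment above.
    member = quote(instance_name, safe="")
    return build_request(leader_url, "POST", f"/api/v1/members/{member}/drain")


def rebalance_request(leader_url: str) -> urllib.request.Request:
    # POST /api/v1/cluster/rebalance: even out targets across instances.
    return build_request(leader_url, "POST", "/api/v1/cluster/rebalance")


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, which is what
    you would likely see when the request reaches a non-leader instance.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A maintenance run would then look like `send(drain_request("http://gnmic-leader:7890", "gnmic-2"))`, followed by the shutdown or upgrade of that instance, and finally `send(rebalance_request("http://gnmic-leader:7890"))` once it is back up (the address and instance name are placeholders). Since both calls are asynchronous, a successful response only means the operation was accepted, not that all targets have already moved.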

karimra added the enhancement New feature or request label Jan 6, 2025

karimra mentioned this issue Jan 20, 2025

add REST endpoints to switch the cluster leader and rebalance the instances load #579

Open
