Thanks a lot for developing and maintaining such a useful tool!
I may be missing something, but here is my concern:
Let's say I have a cluster of 3 gNMIc nodes and I want to update the subscriptions (or perform an upgrade). When I shut down one node, I lose every target connected to that node for ~30s to 1min. Is there any way to force all active connections to close and quickly reopen on another node before doing anything that may have an impact?
Maybe such a feature could be included in the new API graceful shutdown endpoint? Or maybe there are already solutions to this issue that I am missing?
Thanks to anyone who can help me solve this issue 😄
In #579 I added 3 REST endpoints. All requests must be sent to the cluster leader; other instances will return an error.
Switch the cluster leader: DELETE /api/v1/cluster/leader makes the leader release its lock so that another instance can acquire the leader lock.
Drain an instance: POST /api/v1/members/{id}/drain, where id is the name of the instance to drain. The leader moves all the targets that instance is subscribed to onto the other instances in the cluster. This is an async call: if that instance has a large number of targets, draining might take some time, and the API call returns while the targets are still being moved to other instances.
Rebalance the load between instances: POST /api/v1/cluster/rebalance rebalances the number of targets across the cluster instances if the distribution is uneven. This is also an async call, so it might take some time to complete.
So for your case, you can run a drain on the instance you want to shut down. Once that instance is back up, rebalance the cluster.
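The drain-then-rebalance sequence could be scripted roughly as follows. This is only a sketch using the Python standard library: the endpoint paths are copied from the description above, while the leader address and instance name in the usage note are made-up placeholders, and the helper names are mine, not part of gNMIc. Check the paths against the documentation for your gNMIc version before relying on them.

```python
import urllib.request
from urllib.parse import quote

# Paths as quoted in this thread; verify them against your gNMIc version.
LEADER_PATH = "/api/v1/cluster/leader"
DRAIN_PATH = "/api/v1/members/{id}/drain"
REBALANCE_PATH = "/api/v1/cluster/rebalance"


def build_request(base_url, method, path):
    """Build a body-less HTTP request for the given API path."""
    return urllib.request.Request(base_url.rstrip("/") + path, method=method)


def switch_leader_request(base_url):
    """Ask the current leader to release its lock."""
    return build_request(base_url, "DELETE", LEADER_PATH)


def drain_request(base_url, instance_name):
    """Ask the leader to move all targets away from instance_name."""
    path = DRAIN_PATH.format(id=quote(instance_name, safe=""))
    return build_request(base_url, "POST", path)


def rebalance_request(base_url):
    """Ask the leader to even out the targets across instances."""
    return build_request(base_url, "POST", REBALANCE_PATH)


def send(request, timeout=10):
    """Send the request and return the HTTP status code.

    Raises urllib.error.HTTPError on 4xx/5xx responses, for example
    when the request reaches an instance that is not the leader.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

With a hypothetical leader reachable at `http://gnmic-leader:7890` and an instance named `gnmic-2`, you would call `send(drain_request("http://gnmic-leader:7890", "gnmic-2"))` before shutting the instance down, and `send(rebalance_request("http://gnmic-leader:7890"))` once it is back up. Because both calls are async on the server side, a successful response only means the operation was accepted, not that all targets have already moved.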