add a drain timeout #351
Awesome! Thanks @flbla. Can you fix the go test?
A few changes might be required, and this might conflict with an existing PR. It might be worth merging the two approaches.
  }
  if err := kubectldrain.RunCordonOrUncordon(drainer, node, true); err != nil {
  log.Fatalf("Error cordonning %s: %v", nodename, err)
  }

  if err := kubectldrain.RunNodeDrain(drainer, nodename); err != nil {
- log.Fatalf("Error draining %s: %v", nodename, err)
+ log.Error("Error draining %s: %v", nodename, err)
This should probably be Errorf
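To illustrate why the distinction matters, here is a minimal sketch that uses only the standard library. It assumes the logger's `Error` joins its arguments in the style of `fmt.Sprint` and that `Errorf` formats in the style of `fmt.Sprintf`; the helper names `withError` and `withErrorf` and the `errTimeout` value are hypothetical and are not part of this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout is a hypothetical error used only for this illustration.
var errTimeout = errors.New("drain timed out")

// withError mimics a Print-style logging call: the arguments are
// concatenated, so the verbs in msg are left untouched.
func withError(msg, nodename string, err error) string {
	return fmt.Sprint(msg, nodename, err)
}

// withErrorf mimics a Printf-style logging call: msg is used as the
// format string and the verbs are substituted.
func withErrorf(msg, nodename string, err error) string {
	return fmt.Sprintf(msg, nodename, err)
}

func main() {
	msg := "Error draining %s: %v"
	fmt.Println(withError(msg, "node-1", errTimeout))  // verbs stay literal
	fmt.Println(withErrorf(msg, "node-1", errTimeout)) // verbs are replaced
}
```

With the Print-style call, the `%s` and `%v` verbs appear literally in the log line, which is why the formatted variant is the one wanted here.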
  if isDrained {
  invokeReboot(nodeID, rebootCommand)
  for {
  log.Infof("Waiting for reboot")
This should probably be log.Info
  } else {
  log.Infof("Uncordon %s", node.GetName())
  uncordon(client, node)
  deleteFlag := newCommand("/usr/bin/nsenter", "-m/proc/1/ns/mnt", "/bin/rm", rebootSentinelFile)
I think it would be best to use the existing wrapper functions for this, so that it works on all OSes.
Also, what about the case where the sentinel is a command instead of a sentinel file?
  }
  if err := kubectldrain.RunCordonOrUncordon(drainer, node, true); err != nil {
  log.Fatalf("Error cordonning %s: %v", nodename, err)
  }

  if err := kubectldrain.RunNodeDrain(drainer, nodename); err != nil {
- log.Fatalf("Error draining %s: %v", nodename, err)
+ log.Error("Error draining %s: %v", nodename, err)
+ return false
Another PR has grown a bit and touches this code as well; maybe it's worth sharing efforts?
Can you have a look at https://github.com/weaveworks/kured/pull/341/files ?
I may have missed something, but I don't think #341 uncordons the nodes after the timeout?
Correct.
Here, I think the error handling can indeed be improved. We can change the function signature to return the error, and maybe the result.
If an error happens, log it. If forceReboot isn't true, we should probably uncordon and continue the main loop. Otherwise, the control flow continues. That makes it far clearer to read, IMO.
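The control flow suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR or from #341: `node`, `drainFunc`, `drain`, `handleReboot`, `failingDrain` and `okDrain` are hypothetical stand-ins, and the real implementation would call the kubectl drain helpers and the logger instead of the placeholders used here.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for a Kubernetes node object.
type node struct {
	name     string
	cordoned bool
}

// drainFunc is a hypothetical drain operation (for example, a drain with a timeout).
type drainFunc func(n *node) error

// failingDrain and okDrain are placeholder drain operations for the example.
func failingDrain(*node) error { return errors.New("drain timed out") }
func okDrain(*node) error      { return nil }

// drain cordons the node and runs the drain. It returns the error to the
// caller instead of terminating the process.
func drain(n *node, do drainFunc) error {
	n.cordoned = true
	if err := do(n); err != nil {
		return fmt.Errorf("error draining %s: %w", n.name, err)
	}
	return nil
}

// handleReboot reports whether the reboot should proceed. On a drain error it
// logs the error; without forceReboot it uncordons the node and returns false
// so that the caller can continue its main loop.
func handleReboot(n *node, do drainFunc, forceReboot bool) bool {
	if err := drain(n, do); err != nil {
		fmt.Println(err) // placeholder for the real logger
		if !forceReboot {
			n.cordoned = false // placeholder for the real uncordon call
			return false
		}
	}
	return true
}

func main() {
	n := &node{name: "node-1"}
	proceed := handleReboot(n, failingDrain, false)
	fmt.Println("proceed:", proceed, "cordoned:", n.cordoned)
}
```

Returning the error keeps the decision about uncordoning in one place, in the caller, rather than spreading it across the drain helper.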
The docs claim this change was included in release 1.7.0. Is there an ETA?
@fouadsemaan TBH, I am not sure this PR is necessary anymore. It adds extra details on top of a feature that has already been merged, so you might be interested in that feature, @fouadsemaan.
Hi,
This PR was automatically considered stale due to lack of activity. Please refresh it and/or join our Slack channels to highlight it before it automatically closes (in 7 days).
Issue: #78
Rebased on the main branch instead of master (previous PR: #283)