Runner not terminated after cancellation of job #537
Comments
That's strange. Do you have more insight into what the instance is doing? Did a job actually start on it? The fact that cancellation is being attempted suggests a job did start. Even if a job started, the actions runner itself should eventually time out as well (and then terminate the instance). Maybe it will have something useful in its logs once it does? And if a job wasn't started, the runner should be deleted by the idle reaper. At that point, the runner will stop on the instance and terminate itself.

The only guess I have so far is that the instance ran out of memory and started thrashing swap space, so it wasn't able to respond to the GitHub server, causing the cancellation request to time out.
Good hint. I switched to an instance with more memory. I will also add an alert to see when instances sit idle for too long (e.g. for an hour).
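For anyone wanting a similar alert, here is a minimal sketch of the check behind it. It is not part of this project. The function name `find_long_running` and the one-hour default are hypothetical, and the input dicts are assumed to follow the shape of the instance entries returned by boto3's `describe_instances` (`InstanceId`, `LaunchTime`, `State.Name`). Note that it measures time since launch, which is only a rough proxy for "idle"; a real alert would likely also look at whether a job is still running.

```python
from datetime import datetime, timedelta, timezone


def find_long_running(instances, now, max_age=timedelta(hours=1)):
    """Return the IDs of running instances launched more than max_age ago.

    `instances` is assumed to be a list of dicts shaped like the instance
    entries in a boto3 describe_instances response.
    """
    stale = []
    for inst in instances:
        # Skip instances that are already stopping or terminated.
        if inst["State"]["Name"] != "running":
            continue
        if now - inst["LaunchTime"] > max_age:
            stale.append(inst["InstanceId"])
    return stale


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        {"InstanceId": "i-old", "LaunchTime": now - timedelta(hours=3),
         "State": {"Name": "running"}},
        {"InstanceId": "i-new", "LaunchTime": now - timedelta(minutes=5),
         "State": {"Name": "running"}},
    ]
    print(find_long_running(sample, now))
```

The result could then be fed into whatever alerting you already use; how the instance list is fetched and filtered to runner instances is left out here on purpose.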
FYI #518 will cause SSM to terminate the instance instead of the instance terminating itself. That might help here too.
Hey,
I sometimes run into the case where a job is cancelled but the runner is not terminated.
I get a warning sign on the job in the UI.
The step function and the runner itself are still "in progress" and think that there is an ongoing job. That's bad because the EC2 instances keep running without doing anything.
Does anybody else have this issue?
Does anyone have a solution for fixing this behaviour?