-
Notifications
You must be signed in to change notification settings - Fork 379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace pid with flock for runtime config loading #5435
base: main
Are you sure you want to change the base?
Conversation
Shouldn't we leave it as is and eventually remove the whole runtime config file instead? |
how do you mean? |
This pull request has merge conflicts that need to be resolved. |
b58069c
to
92be067
Compare
That appears to be a fairly intrusive change. This is a cheap and non-intrusive way to fix the specific problem at hand that easily can be backported. |
internal/pkg/flock/flock_windows.go
Outdated
return false, err | ||
} | ||
|
||
handle := windows.Handle(file.Fd()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be safer to go through the SyscallConn interface ...
conn, err := file.SyscallConn()
if err != nil {
return false, err
}
err = conn.Control(func(fd uintptr) {
handle := windows.Handle(fd)
...
})
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there is no reason to go through the SyscallConn interface. Lets keep it simple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure that this doesn't interfere with Go's GC and finalization stuff in any case? Moreover, I don't think we need to turn the file descriptor into blocking mode.
// create a temporary file for runtime config | ||
tmpfile, err := os.CreateTemp("", "runtime-config") | ||
require.NoError(t, err) | ||
defer os.Remove(tmpfile.Name()) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably a remnant from a merge conflict resolution.
// create a temporary file for runtime config | |
tmpfile, err := os.CreateTemp("", "runtime-config") | |
require.NoError(t, err) | |
defer os.Remove(tmpfile.Name()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. We now call dir.Init()
early, to ensure that the directory exists and have the correct permissions before trying to create the lock file. If we don't create a subdirectory (eg /tmp/runtime-config
) that the test process owns, it will try to change the permissions on the parent directory (eg /tmp
) which will result in a permission error.
So we need to create a directory we have permission to chmod
on for this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, are you talking about NewRuntimeConfig
? This test is about LoadRuntimeConfig
, which doesn't call dir.Init
and passes just fine without the above block. Moreover, tmpfile
is not used at all in the rest of the test.
This pull request has merge conflicts that need to be resolved. |
98d92f2
to
58b09ae
Compare
Use lock file and flock(2) to ensure there is only a single instance of k0s running. This is more reliable than storing the pid in the runtime config. This solves false positives with k0s runtime config leftovers. Fixes: k0sproject#5399 Signed-off-by: Natanael Copa <[email protected]>
@@ -128,7 +136,7 @@ func NewRuntimeConfig(k0sVars *CfgVars) (*RuntimeConfig, error) { | |||
Spec: &RuntimeConfigSpec{ | |||
NodeConfig: nodeConfig, | |||
K0sVars: k0sVars, | |||
Pid: os.Getpid(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might need to keep the PID until the next k0s minor release for backwards compatibility. Old versions will still try to use it to check for other instances.
return fmt.Errorf("failed to close the runtime config file: %w", err) | ||
} | ||
|
||
if err := os.Remove(r.lockFile.Name()); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we are relying on the lock file name to delete it, we should ensure that we're using an absolute path to open it. Might be worth an extra filepath.Abs
along with an explanatory comment just before calling tryLock
.
} | ||
|
||
// locked checks if the lock is currently held by another process. | ||
func locked(path string) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be implemented in terms of tryLock
, for both UNIX and Windows:
func locked(path string) (bool, error) {
f, err := tryLock(path)
if err != nil {
if errors.Is(err, ErrK0sAlreadyRunning) {
return true, nil
}
return false, err
}
return false, f.Close()
}
`) | ||
require.NoError(t, os.WriteFile(rtConfigPath, content, 0644)) | ||
|
||
// try to load runtime config and check if it returns an error | ||
spec, err := LoadRuntimeConfig(rtConfigPath) | ||
assert.Nil(t, spec) | ||
assert.ErrorIs(t, err, ErrK0sNotRunning) | ||
t.Cleanup(func() { require.NoError(t, spec.Cleanup()) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't a call to spec.Cleanup
only be necessary after a successful call to NewRuntimeConfig
?
Moreover, since LoadRuntimeConfig
returned an error, there shouldn't be the need to call Cleanup
, especially as spec is nil
. So if the goal is to test that Cleanup
is a no-op when called on a nil
receiver, I'd rather do this in a separate test case.
@@ -86,4 +90,5 @@ func TestNewRuntimeConfig(t *testing.T) { | |||
_, err = NewRuntimeConfig(k0sVars) | |||
assert.Error(t, err) | |||
assert.ErrorIs(t, err, ErrK0sAlreadyRunning) | |||
t.Cleanup(func() { require.NoError(t, spec.Cleanup()) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd move this to line 82, so that it's clear that it's cleaning up the NewRuntimeConfig
call. Also, in test cleanup functions, there's no need to panic with require
(IIRC it is even problematic), so better use assert
instead of require
.
// create a temporary file for runtime config | ||
tmpfile, err := os.CreateTemp("", "runtime-config") | ||
require.NoError(t, err) | ||
defer os.Remove(tmpfile.Name()) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, are you talking about NewRuntimeConfig
? This test is about LoadRuntimeConfig
, which doesn't call dir.Init
and passes just fine without the above block. Moreover, tmpfile
is not used at all in the rest of the test.
Use lock file and flock(2) to ensure there is only a single instance of k0s running. This is more reliable than storing the pid in the runtime config.
This also solves false positives with k0s runtime config leftovers.
Fixes: #5399
Description
Fixes #5399
Type of change
How Has This Been Tested?
Checklist: