Skip to content
This repository has been archived by the owner on Jan 30, 2025. It is now read-only.

Get DevTools URL from data dir file instead of by parsing stdout #563

Closed
wants to merge 1 commit into from

Conversation

imiric
Copy link
Contributor

@imiric imiric commented Oct 3, 2022

This is slightly more elegant than parsing stdout, but more importantly, shouldn't cause race conditions where we miss the "DevTools listening on" line, since it doesn't start a goroutine.

A drawback is that it could take slightly longer to return the URL because it's reading from disk and does several read attempts with delays, but in my tests the difference is negligible. In total, it allots 500ms to read the file, which should hopefully be plenty even on IO-starved machines (VMs, CI, etc.). IMO if this takes more time than that then it's likely the environment will not be fit for running a test. But we can always increase it if you prefer.

I didn't add tests for it, since pretty much all our tests check this functionality indirectly, but let me know if I should add some.

Also note that I haven't confirmed this definitely fixes grafana/k6#4425, since I wasn't able to reproduce the issue in WSL2 (though did run into some others... 😓), and the issue is very rare on Linux/EC2/Cloud, but I'm confident it does, and we can make a Cloud release and test this thoroughly to have more assurance.

This is slightly more elegant than parsing stdout, but more importantly,
shouldn't cause race conditions where we miss the "DevTools listening on" line,
since it doesn't start a goroutine.

A drawback is that it could take slightly longer to return the URL
because it's reading from disk and does several read attempts with delays,
but in my tests the difference is negligible. In total, it allots 500ms
to read the file, which should hopefully be plenty even on IO-starved
machines (VMs, CI, etc.). IMO if this takes more time than that then
it's likely the environment will not be fit for running a test.

Fixes #491
f *os.File
fpath = filepath.Join(dataDir, "DevToolsActivePort")
maxReadAttempts = 10
readAttemptDelayMS = 50
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, test-tip failed with panic: launching browser: getting DevTools URL: unable to read file "/tmp/xk6-browser-data-2308635430/DevToolsActivePort" in 500ms. 😞

I guess we could just increase it to 5s total to be safe (so 100 attempts)? Even 1 or 2s might be too low for some machines.

@mstoykov
Copy link
Contributor

mstoykov commented Oct 4, 2022

This is slightly more elegant than parsing stdout, but more importantly, shouldn't cause race conditions where we miss the "DevTools listening on" line, since it doesn't start a goroutine.

From this line I gather that currently it is thought that the reason why xk6-browser can't find the port to connect to is because:

  • the starting of the goroutine takes too long
  • and the line is already printed by the time the goroutine starts to read it?

@inancgumus
Copy link
Member

@imiric We can have the best of both worlds:

  • First, try to get the URL faster via stdout (the previous way)
  • Second, if the first one fails, get it from the file (current way)

This way, we can get the URL faster, and only if it fails can we try to get it from the file.

WDYT?

@imiric
Copy link
Contributor Author

imiric commented Oct 4, 2022

@mstoykov That's the theory, but it wasn't confirmed since this issue is so difficult to reproduce.

It could also be that it just takes more than 30s to start the browser process, which is very excessive, but not out of the question on VMs with bursty/unreliable IO like in CI or EC2. In that case, this won't help and we should increase the number of attempts/timeout, but I would still argue this approach is cleaner than parsing stdout.

@mstoykov
Copy link
Contributor

mstoykov commented Oct 4, 2022

That's the theory, but it wasn't confirmed since this issue is so difficult to reproduce.

Is there something else reading the output? Because if it isn't - there is no way that we can miss it. If anything the current code should continue to read the output as otherwise the buffer used to save it (in the OS) might overfill. This likely isn't a problem with the current usage where (from my observations) the browser does not output more then 3-4 lines at the very beginning of the execution.

So it seems to me that

It could also be that it just takes more than 30s to start the browser process
Is the more likely culprit?

I meant to open an issue ot log the actual output of the browser 🤔 Maybe that will help with diagnosing this?

@imiric
Copy link
Contributor Author

imiric commented Oct 4, 2022

@inancgumus I'd rather have one, preferably simpler, way of doing this.

Besides, I'm not certain whether the file is written before or after that line is printed to stdout, so reading the file might not make that big of a difference. At least it doesn't from my local tests. I'll try to confirm with strace or * shudders * reading the Chromium source.

var err error
f, err = os.Open(fpath) //nolint:gosec // false positive, https://github.com/securego/gosec/issues/439
if errors.Is(err, os.ErrNotExist) {
time.Sleep(time.Duration(readAttemptDelayMS) * time.Millisecond)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might we need to use a select statement here with time.Timer and also watch for the context being cancelled?

if err != nil {
return nil, err
}

wsURL, err := parseDevToolsURL(ctx, stdout)
wsURL, err := getDevToolsURL(dataDir.Dir)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the getDevToolsURL was called prior to calling execute (which will require some goroutine orchestration so that execute is only called once getDevToolsURL has started waiting for the file) then we should avoid any possible race conditions, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, that could be a solution as well. Though a) we're not 100% sure this is caused by a race, see the discussion above, and b) starting a goroutine with proper orchestration and parsing stdout is arguably messier than just reading a file.

@imiric
Copy link
Contributor Author

imiric commented Oct 4, 2022

According to strace, the file is written to immediately after DevTools listening on ... is written to stderr:

write(2, "\nDevTools listening on ws://127."..., 98) = 98
creat("/tmp/xk6-browser-data-3945058657/DevToolsActivePort", 0666) = 48
write(48, "40575\n/devtools/browser/fc99a07c"..., 60) = 60

Still, this approach could have a slight delay (~50ms), depending on when we succeed reading the file.

Judging by the feedback, you all prefer the stdout parsing approach, and we should drop this? Since a race is unlikely as stdout would be buffered, as Mihail says, then the only explanation is that it truly does take >30s to start the browser in some cases. That's out of our hands to fix here, but maybe we can run on more stable and better performing EC2 instances to minimize the chances of this happening.

@ankur22
Copy link
Collaborator

ankur22 commented Oct 4, 2022

then the only explanation is that it truly does take >30s to start the browser in some cases. That's out of our hands to fix here, but maybe we can run on more stable and better performing EC2 instances to minimize the chances of this happening.

I think it would be good to find a way to replicate this issue. Let me know if you need me to test any changes on my Windows laptop.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Timeout launching browser installed as a Snap package
4 participants