Test explorer stability: Use test results as main truth source for explorer #1874
Conversation
I realize this is a large merge. Is there something I can do to simplify the review process?
I'll request a review from @baronfel. This will solve any problems 😈
release/README.md
@@ -21,11 +21,13 @@ The LSP that powers language features is [FSAutoComplete](https://github.com/fsh

The library that powers project and script loading is [proj-info](https://github.com/ionide/proj-info)
The README changes are an artifact of merging main, not actually related to my work. I can try to make them go away if desired.
Looks great! I ran this against a few projects:
- IcedTasks
- Fantomas

It worked well for these.
However, there was one project it didn't seem to work well for: it seemed to keep spawning processes that never ended, which used all CPU and RAM. I'm guessing it's related to the `Promise.Parallel` calls, and we should probably throttle these.
Also, it seems it didn't find all tests for FsToolkit.ErrorHandling. There are many more tests than this, but for some reason it didn't pick them up.
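One possible throttling approach is a sketch like the following, which awaits each chunk of projects before starting the next. This is only an illustration: the chunk size is arbitrary, and the `Promise.bind`/`Promise.Parallel` usage follows the Fable.Promise library loosely.

```fsharp
// Sketch only: run at most `maxParallel` test invocations at a time by
// chunking the project list and awaiting each chunk before the next begins.
// `DotnetTest.dotnetTest` is the call from the diff under discussion.
let runThrottled maxParallel projects =
    projects
    |> List.chunkBySize maxParallel
    |> List.fold
        (fun acc chunk ->
            acc
            |> Promise.bind (fun results ->
                chunk
                |> List.map (fun p -> DotnetTest.dotnetTest p [||])
                |> Promise.Parallel
                |> Promise.map (fun chunkResults -> results @ List.ofArray chunkResults)))
        (Promise.lift [])
```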
src/Components/TestExplorer.fs

```fsharp
let! _ =
    Project.getAll ()
    |> List.map (fun p -> DotnetTest.dotnetTest p [||])
    |> Promise.Parallel
```
Yeah, this seems to be the culprit. However, making this sequential also seems to be quite slow on FAKE, since it's trying to get a trx file for every project. Maybe we need to be smarter here and somehow send from FSAC what we think the test projects are, or find some other way to determine it. Thoughts @baronfel?
Yes - FSAC should be able to help here quite a bit. Since this tooling now relies on the Test SDK (to power the `dotnet test` integration, test list, etc.), FSAC should learn how to tell you if a project references that package. Consider using `Project.getInWorkspace()` instead (which also makes sure that you're only loading projects in the solution that the user has selected), which will give you a `ProjectLoadingState seq`. From this you can filter to the items that are the `ProjectLoadingState.Loaded` DU case, which has a `Project` field. This `Project` has the `PackageReference` array, and you can look for a `PackageReference` to `Microsoft.NET.Test.Sdk` to get your answer.
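A rough sketch of that filter, assuming `ProjectLoadingState` and `Project` shapes along the lines described above (the exact field names here are illustrative, not the real Ionide types):

```fsharp
// Sketch only: narrow workspace projects down to likely test projects by
// looking for a PackageReference to Microsoft.NET.Test.Sdk. Field names
// (PackageReferences, Name) are assumptions based on the discussion.
let findTestProjects () =
    Project.getInWorkspace ()
    |> Seq.choose (function
        | ProjectLoadingState.Loaded project -> Some project
        | _ -> None)
    |> Seq.filter (fun project ->
        project.PackageReferences
        |> Array.exists (fun pr -> pr.Name = "Microsoft.NET.Test.Sdk"))
```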
Good finds. I'll see what I can figure out for this and the missing FsToolkit.ErrorHandling tests.
I did consider several alternative approaches to test discovery. The two that seemed plausible:
- A test project name pattern setting
- Build in project graph order (investigated, but I wasn't able to get that order easily)

Unfortunately, selecting test projects based on the code-discovered tests would miss test projects in other languages, in unknown frameworks, or that only use Expecto test methods by proxy.
Trying `dotnet test --list-tests` doesn't work because we're not guaranteed fully qualified test names (e.g. MSTest lists the method name only). It probably wouldn't solve this issue anyway, since each project still needs to build before the tests can be listed.
@baronfel That's a really good idea. I'll try to implement that today.
> I'll see what I can figure out for this and the missing FsToolkit.ErrorHandling tests.

I did do a full rebuild and forced a refresh of the test list, and it seemed to work at that point. I'm guessing it was failing on the generation of the trx file since it uses `--no-build`. I wonder if there's a way to report that we couldn't get a trx file out of a project, so it's clearer to a user that something else needs to happen.
I experimented, and I think the missing tests and the failure to load any tests are both related to parallel builds causing each other to error. I ran into some issues trying to fix this, and it may take a bit to figure out.
First, I can't look for `Microsoft.NET.Test.Sdk` because projects using Paket aren't listing their package dependencies, so checking for that dependency causes no tests to be found.
Second, I tried to build all projects in sequence and then test in parallel, but I'm getting unclear build errors. Still trying to figure out why this is failing.
@farlee2121 the comment about Paket shouldn't be the case - after a Restore has occurred, Paket still generates PackageReference MSBuild items that can be inspected and returned by FSAC. This is how the 'Package Dependencies' node in the solution explorer works for Paket-using projects. You can see this in action in `obj/<projectName>.paket.props` in a given project, for example.
Aha! Restore was what I overlooked. I was filtering projects before restore was run.
I think initial test discovery is stable now. I'm still trying to test FAKE. Even loading the workspace is hard for my computer.
Analyzing package dependencies also opens up a new possibility for speeding up discovery. We can look for test libraries we know list full names with `--list-tests` to avoid a full test run. The issue is projects with multiple test libraries, though any quirks should resolve with the first test run.
FsToolkit.ErrorHandling revealed another overlooked issue: multiple target framework versions. This explorer is only showing test results for one version, while `dotnet test` and Visual Studio show a test item for each framework version.
Don't stress about TFM evaluation too strongly for this iteration - we have a number of issues already around the selected project context. For example, we always assume the `DEBUG` configuration right now. TFM is just another instance of this. IIRC we use the 'first' TFM in a TFM list. You should feel free to keep this convention, and we should work towards a better model all-up.
A quick note about potential places for the TestResults state files.
Overall this is lovely work. I'm incredibly impressed with the thought and effort you put into this, and I agree with @TheAngryByrd that with a bit more testing and hammering out the last bits it will be a huge quality of life improvement for all Ionide users.
A bit longer term, if you're interested I would love to see if you have ideas for parts of this that could be pushed down into FSAC. Generally, FSAC is more easily usable from other editors like neovim, emacs, etc. - so if FSAC can handle the gnarly state management then it becomes easier for those editors to just sort of map those APIs to their local environments.
src/Components/TestExplorer.fs

```fsharp
(ResizeArray(
    [| "test"
       projectPath
       "--logger:\"trx;LogFileName=Ionide.trx\""
```
nit: consider writing this file somewhere inside Ionide's per-extension cache directory, which is part of the `ExtensionContext` passed to the `activate` call in fsharp.fs. Some docs on this directory are available here. This will keep the user's directory clean. I might also suggest a subdirectory in this `storagePath` for test-related temp files, and another subdirectory in that for some kind of workspace identifier to prevent clobbering across multiple VSCode instances?
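That layout might look something like the sketch below. The member names (`storagePath`, `Node.Api.path.join`) follow the vscode and node bindings loosely and may differ from the real APIs; the directory names are illustrative.

```fsharp
// Sketch only: place trx output under the extension's per-workspace storage
// path, with a subdirectory per workspace identifier so multiple VSCode
// instances don't clobber each other's results.
let trxDirectory (context: ExtensionContext) (workspaceId: string) =
    match context.storagePath with
    | Some storage -> Node.Api.path.join (storage, "TestResults", workspaceId)
    | None -> failwith "No storage path available for this extension context"
```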
Thank you for the kind words!
I can certainly check out these improvements to the result file location. Your suggestions make a lot of sense. The results location is mostly an artifact of the previous implementation.
I'll also look into moving code to FSAC, but it'll probably take a bit since I'll need to familiarize myself with FSAC.
Thanks for your input and patience. Here's a status update.

**Test count mystery solved**

I figured out why FsToolkit.ErrorHandling has a few tests that it only runs on certain targets (example). I'm only getting tests listed for the first target framework, while the trx approach gets whatever target framework writes the trx last.

**Outstanding work clarification**

It'd be good to clarify what needs to be done for a V1 of this test explorer. My understanding of outstanding work is:

Anything I'm missing for outstanding work?
@farlee2121 that task list sounds great to me - I agree with deferring the FSAC side of things. For the TargetFramework question, I think what I've seen other test explorers do is generate a different TRX for each TargetFramework and then union them together. That to me creates a second question: how do we track which TFM(s) each test is for, so that we can trigger the correct TFM? I expect that the test list information we're working with right now is not ready to also start 'pivoting' on TFM - that is perfectly fine. FSAC already doesn't 'pivot' on Debug/Release or TFM. This just becomes one more place that we'd need to plumb through. I'm perfectly happy for this iteration for you to use the first TFM in each project - this is what we've been doing up till now, so at least the editor experience and the test list experience would be aligned.
**TFM investigations**

@baronfel Regarding how to track the TFM for a test and what other test explorers do, Visual Studio uses projects as the top level in the test explorer. If a project has multiple target frameworks, it'll create a new top-level tree item for each target framework. Organizing the test explorer by projects this way could allow a better test discovery experience for large projects, since we could progressively populate the explorer as each project finishes discovery. For now, I made a quick change to always run tests against the first TFM as suggested.

**Question on Window Isolation**

Q: In light of this prior note, do we still want further window isolation for V1?

**List-only Merged**

I think my bitrot concerns are substantially reduced now that we test/list only one TFM at a time.
So then the only outstanding question is Window Isolation? In that case I'm perfectly happy to go with what we get out of the box from VSCode, especially to prevent delaying this wonderful PR any further! I agree that TFM-pivoted projects would be a fine way to show tests in the future 👍
There could also still be some improvement in reporting errors during discovery, like logging stderr when a restore, build, or test command fails. That wouldn't take me long. Should we go ahead and partition the tests by project (since that's the direction it looks like we'll head anyway)? I did a regression check, and I do think the code is mergeable as is. It behaves consistently on several projects, including large ones, and the resource starvation issues are resolved.
Yes, this is reasonable - and please include the selected TFM in that tree node so that users at least know which TFM is currently being executed. Then I think we're good to go with this.
Paves the way for multiple target frameworks and provides context so users can figure out why the test count is different than the console or other test explorers. See this discussion: ionide#1874 (comment)
Ah - yes, the Restore issues aren't your responsibility. It's fair to kick out to the user at that point. FSAC is supposed to be doing some restore magic implicitly, but I'm not sure how well that works. Regarding the split, this sounds really familiar. I think there was a similar issue with Expecto generally and splitting. We might have to do trx-based discovery to really solve that...
I remember @PaigeM89 doing a lot of work on https://github.com/BinaryDefense/BinaryDefense.Junit.Expecto.TestLogger. I wonder how much would apply here.
Detect build failures and when a test run likely failed to complete
@TheAngryByrd Thanks for testing. Sounds like I need to add more repos to my testing.
Code locations can't be matched when the code location hierarchy doesn't match the test name hierarchy. For example, if tests are referenced by value:

```fsharp
let cancellableTaskResultTests =
    testList "CancellableTaskResult" [
        cancellableTaskResultBuilderTests // <- this kind of situation
        functionTests
    ]
```

As for both showing up: in my testing, sometimes running the tests replaces a renamed test item. It might have to do with a known error when merging code-sourced updates. Renaming tests where the code location doesn't match the test name hierarchy (like we have here) causes them to get added to the test explorer incorrectly, because it looks like an added test when really it's a mislocated test. I decided to leave this issue because it doesn't have a clear solution that doesn't kill live renaming, and it seemed like a relatively uncommon scenario. Maybe that was a bad call.
That error sounds like it doesn't like the filter expression for some reason. I'll have to dig.
We might be able to modify
I think I know what's going wrong here. Might be another version of too much parallelization. I'll poke and see if I can fix it.
Well, I tried using the separate ClassName and TestName fields from the trx file to partially solve the issue. My next options for addressing it are unclear or time-intensive partial solutions. I'm going to move on to the issues uncovered by Jimmy before pursuing this further.
Gonna mark this as request changes until you think it's good for another review
Prevents performance issues experienced with a large number of test projects.
Parallel project builds can conflict if they have mutual dependencies. This was more prominent during discovery and already fixed there, but it can also happen when running tests. Linearizing the build and limiting parallelism also changed the test status UX. Explicitly showing all tests as enqueued right away makes it more intuitive that the explorer responded correctly to the user's run request.
Pass cancel tokens through the test, build, and refresh call chains. Attempt to kill any long-running spawned processes (e.g. dotnet test, msbuild). Unfortunately, secondary processes (like testhost) aren't cancelling when their parent does.
Used for workarounds specific to NUnit.
NUnit can't handle spaces in test filter expressions. The mitigation depends on knowing the test is an NUnit test. However, that information was only available if code-sourced test info was merged after code discovery. This makes that info available from trx discovery too.
@TheAngryByrd I looked into the most recent batch of issues
I limited the max parallel test projects to reduce chance of resource starvation and adjusted the test tree item status management to keep the experience intuitive.
I added cancellation tokens and kill processes based on them, but the results are less than desired. It may work better on Linux, since the SIGINT signal should cause the process to terminate as an interrupt, but Windows doesn't handle the signal. We could possibly terminate testhost by name, but that feels dangerous. Maybe someone else knows of a smarter way to handle process termination.
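The wiring described above might be sketched as follows. The `childProcess` and token member names are assumptions based on the Fable node and vscode bindings, not the exact Ionide code.

```fsharp
// Sketch only: tie a spawned `dotnet test` process to a CancellationToken so
// cancelling the run kills the process.
let spawnWithCancellation (token: CancellationToken) (args: string list) =
    let child = Node.Api.childProcess.spawn ("dotnet", ResizeArray args)
    token.onCancellationRequested.Invoke (fun _ ->
        // SIGINT terminates the process cleanly on Linux/macOS, but Windows
        // largely ignores the signal, which is why testhost can linger there.
        child.kill "SIGINT")
    |> ignore
    child
```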
I investigated. The reason this is so slow is that YoloDev.Expecto.TestSdk doesn't respect the filter expression, so it tries to run all of the tests. This particular set of tests is large and slow. The tree item selected in your screenshot won't show a result status when it does eventually finish, because it's a case of a mislocated test (where the code structure doesn't match the test structure, so it creates an erroneous tree item that can't be matched up with test results). The correctly nested item for that test does show status correctly after the run finishes. Fixing this slowness would require changes to the YoloDev adapter.
This error is because NUnit can't handle spaces in filter expressions. Filtering on the part of the name before the first space creates the expected user experience, but it might end up running multiple tests. This shouldn't be slow as long as the first part of the test name isn't shared by a large number of tests.

**Intent to defer test name-based issues**

I don't plan to look deeper into issues with name separators or matching locations for tests with different code location vs test name hierarchies. These issues are complex (even VS struggles with them). The previous version of the Ionide test explorer couldn't run these tests either, for the same reasons (it still had to match up results from the trx, but couldn't if the hierarchy was off). There's a lot that's improved or fixed in this version and (I don't think) any new broken behaviors. I propose that the updates are worth merging even with these known issues. Thoughts?
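The before-the-first-space workaround could look roughly like this sketch; the helper name is illustrative, and `FullyQualifiedName~` contains-matching is one way `dotnet test --filter` expressions are commonly written.

```fsharp
// Sketch only: build a test filter expression, working around NUnit's
// inability to handle spaces by matching on the name prefix before the
// first space (which may match more than one test).
let buildFilterExpression (isNUnit: bool) (fullTestName: string) =
    let name =
        if isNUnit && fullTestName.Contains " " then
            fullTestName.Substring(0, fullTestName.IndexOf ' ')
        else
            fullTestName
    sprintf "FullyQualifiedName~%s" name
```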
I agree with your analysis @farlee2121 - thank you for the investigation into the specific issues with different test SDKs. I think we're in a place where we can review the latest batch of changes and then merge if all is well!
This works so well on Fantomas's repo!
@@ -1707,6 +1707,12 @@
      }
    ]
  },
  "viewsWelcome": [
This is a really great idea!
@@ -24,6 +24,79 @@ module MSBuild =
    | Ok msbuild -> Promise.lift msbuild
    | Error msg -> Promise.reject (exn msg))

let invokeMSBuildWithCancel project target (cancellationToken: CancellationToken) =
nit: for follow-up we should see if `invokeMSBuildWithCancel` and `invokeMSBuild` can be unified.
Ah yes, I forgot to follow up with that. Opening a PR.
EDIT: the PR
@TheAngryByrd any concerns? Or can we clear your change request and merge?
It was a placeholder and @farlee2121 has requested another review
Thanks a bunch to @baronfel and @TheAngryByrd for the reviews and testing!
@farlee2121 thanks for this! Is there a way to disable this? This seems to be pretty slow on one of my work projects.
@TheAngryByrd It can't be disabled.
We have some interesting msbuild steps that may cause long rebuilds, and so it can basically sit there for minutes trying to run. For instance, we use WebSharper; they have this "booster" process that gets called out to from msbuild. While it works fine, it seems to be extremely slow with concurrent builds (such as running a build from the CLI alongside the test discovery build).
Ah, so the auto-discovery is the problem. It should be pretty easy to create a setting to turn that off.
PR for the setting created. In other news, the Expecto test adapter recently added support for filter expressions. That should speed up small test group runs on large Expecto-based projects. I also think I found some ways forward on mismatched code-vs-name test locations.
WHAT
Improve test explorer stability and more reliably reflect the structure of the tests independent of code structure.
This means we also support tests that we can't locate in code: for example, tests in C#, tests written with MSTest, or user-written wrappers of Expecto test methods.
HOW
Change the test explorer to base its hierarchy on test results, then merge in changes from code updates and test results.
WHY
#1869
I was experiencing several issues with the test explorer
There were also some stability issues
Overview Of Changes
For a detailed list of behaviors, see my regression test checklist in my sample repo for testing the explorer
Tree structure reflects test results, not code structure. This solves issues like wrapped test methods, libraries without code adapter support, and C# tests.
Live rename, add, and remove from code updates is stable
updates-from-code.mp4
Test list updates to reflect run results. This is especially helpful for tests without code locations, since they don't update as code changes. Instead, they can update from partial test runs and the user doesn't have to reload the whole test explorer.
tests-update-on-run.mp4
I also added console output. This improves troubleshooting and failure message discoverability for tests without code locations.
Though you can also see errors for any test in the results view by clicking on the message under the test
I also fixed Run All never finishing
test-run-never-finishing-fixed.mp4
And I added a refresh handler so users can fully rediscover tests if something goes wrong.
Future?
There are certainly more improvements that could be made, but it seems like the work is at a stable and useful point worth PRing.
A few prospective future tasks