New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Reuse typechecking results - stage 1 #17898

Open

psfinaki wants to merge 2 commits into dotnet:main from psfinaki:reuse-typecheck-a

Member

psfinaki commented Oct 18, 2024 •

edited

Loading

First part of the implementation of reusing typecheck results, as per the design doc.

This does the following:

adds the flag --reusetypecheckingresults, with all the hooking
detects if there is a case to reuse typechecking results

Nothing happens - yet - we only report related activities though.

Cache invalidation scenarios I looked at:

changed sources
changed references
changed command line

Contributor

github-actions bot commented Oct 18, 2024 •

edited

Loading

❗ Release notes required

✅ Found changes and release notes in following paths:

Warning

No PR link found in some release notes, please consider adding it.

Change path Release notes path Description

src/Compiler docs/release-notes/.FSharp.Compiler.Service/9.0.200.md No current pull request URL (#17898) found, please consider adding it

psfinaki changed the title ~~Reuse typechecking results - stage 1~~ [WIP] Reuse typechecking results - stage 1

psfinaki commented

View reviewed changes

src/Compiler/Driver/GraphChecking/Graph.fs Outdated Show resolved Hide resolved

psfinaki commented

View reviewed changes

tests/benchmarks/FCSBenchmarks/CompilerServiceBenchmarks/ReuseTypecheckingResultsBenchmarks.fs Outdated Show resolved Hide resolved

psfinaki commented

View reviewed changes

src/Compiler/Driver/ParseAndCheckInputs.fs Show resolved Hide resolved

vzarytovskii reviewed

View reviewed changes

src/Compiler/Driver/ParseAndCheckInputs.fs Outdated Show resolved Hide resolved

psfinaki commented

View reviewed changes

src/Compiler/Driver/ParseAndCheckInputs.fs Outdated Show resolved Hide resolved

psfinaki commented

View reviewed changes

src/Compiler/Driver/fsc.fs Outdated Show resolved Hide resolved

psfinaki requested a review from T-Gro

October 18, 2024 16:31

Member Author

psfinaki commented Oct 18, 2024

@T-Gro PTAL, this is an early stage so I mostly just wanted to discuss the questions I raised in the comments, so that we go in the agreed direction.

T-Gro reviewed

View reviewed changes

src/Compiler/Service/BackgroundCompiler.fs Outdated Show resolved Hide resolved

T-Gro reviewed

View reviewed changes

vsintegration/src/FSharp.Editor/FSharp.Editor.resx Outdated Show resolved Hide resolved

T-Gro reviewed

View reviewed changes

src/Compiler/Driver/ParseAndCheckInputs.fs Outdated Show resolved Hide resolved

T-Gro reviewed

View reviewed changes

src/Compiler/Driver/ParseAndCheckInputs.fs Outdated Show resolved Hide resolved

Contributor

edgarfgp commented Oct 24, 2024

It seems like this is going to rely/build on top of graph checking have couple questions:

Would this PR Turn a bunch of options by default #17442 be included ?
If we are going to rely on this would be important to enable by default ?

Member Author

psfinaki commented Oct 25, 2024

@edgarfgp so we'll need to discuss this - personally I'd like to turn graph-checking on. Now (like, before Xmas) is a rather good time - we can catch some bugs there before the next big release.

It's somewhat scaring but, well, just this probably shouldn't be stopping us 👀 Graph-checking and reusing typechecking results can test each other and I'd say we should use this benefit here.

I am not speaking for the whole team just yet - only sharing some thoughts. We will discuss and prioritize this.

psfinaki force-pushed the reuse-typecheck-a branch from 6df7695 to be247f3 Compare

November 5, 2024 16:24

psfinaki marked this pull request as ready for review

November 6, 2024 14:00

psfinaki requested a review from a team as a code owner

November 6, 2024 14:00

psfinaki changed the title ~~[WIP] Reuse typechecking results - stage 1~~ Reuse typechecking results - stage 1

psfinaki force-pushed the reuse-typecheck-a branch 2 times, most recently from ce9c28e to 3275457 Compare

November 8, 2024 10:18

vzarytovskii requested changes

View reviewed changes

src/Compiler/Driver/ReuseTcResults/ReuseTcResultsDriver.fs

Comment on lines +80 to +83

+                  let getThisCompilationReferences =
+                      List.map (fun (r: AssemblyReference) -> r.Text, getContentHash r.Text)
+                      >> List.map (fun (name, hash) -> $"{name}: {hash}")
+                      >> List.toArray

Member

vzarytovskii Nov 11, 2024

That is A LOT of reading into memory, what's the memory consumption for the whole BCL and runtime assemblies? Can we check MVID and/or date? Also, in this case we gonna be reading them twice? One for reuse tc and in normal compilation. There should be some cache introduced for them if we'll have to read them twice.

Member Author

psfinaki Nov 11, 2024

We can do those things. Checking hash helps in the scenarios where one changes a dependency, reverts it immediately (ctrl + Z) and then triggers the recompilation. If we only look at the date, this will look like a new recompilation.

We can different processes for system and non-system dependencies, do you think it's worth it?

Member

vzarytovskii Nov 11, 2024

I think we should just straight up use MVID for assemblies. If it's good for deterministic builds and msbuild, it should be good enough for us. This way you'll only be reading header, and not calculate anything.

src/Compiler/Driver/ReuseTcResults/ReuseTcResultsDriver.fs

Comment on lines +86 to +118

+                      if FileSystem.DirectoryExistsShim tcDataFolderName then
+                          use _ = Activity.start Activity.Events.reuseTcResultsCachePresent []
+                          let prevCompilationCmdLine = getPrevCompilationCmdLine ()
+                          let thisCompilationCmdLine = getThisCompilationCmdLine tcConfig.cmdLineArgs
+                          if prevCompilationCmdLine = thisCompilationCmdLine then
+                              let prevCompilationReferences = getPrevCompilationReferences ()
+                              let thisCompilationReferences = getThisCompilationReferences tcConfig.referencedDLLs
+                              if prevCompilationReferences = thisCompilationReferences then
+                                  let prevCompilationGraph = getPrevCompilationGraph ()
+                                  let thisCompilationGraph = getThisCompilationGraph inputs
+                                  if prevCompilationGraph = thisCompilationGraph then
+                                      use _ = Activity.start Activity.Events.reuseTcResultsCacheHit []
+                                      () // do nothing, yet
+                                  else
+                                      use _ = Activity.start Activity.Events.reuseTcResultsCacheMissed []
+                                      writeThisCompilationGraph thisCompilationGraph
+                              else
+                                  use _ = Activity.start Activity.Events.reuseTcResultsCacheMissed []
+                                  writeThisCompilationReferences thisCompilationReferences
+                          else
+                              use _ = Activity.start Activity.Events.reuseTcResultsCacheMissed []
+                              writeThisCompilationCmdLine thisCompilationCmdLine
+                      else

Member

vzarytovskii Nov 11, 2024

Can cognitive load be reduced here somehow? Not to have a bunch of nested conditions?

src/Compiler/Driver/ReuseTcResults/ReuseTcResultsDriver.fs

+              let (++) a b = Path.Combine(a, b)
+              [<Sealed>]
+              type ReuseTcResultsDriver(tcConfig: TcConfig) =

Member

vzarytovskii Nov 11, 2024

Should probably be named differently, something along the lines of CachingDriver.

src/Compiler/Driver/ReuseTcResults/ReuseTcResultsDriver.fs

Comment on lines +43 to +45

+                  let writeThisCompilationCmdLine = writeThisTcData cmdLineFilePath
+                  let writeThisCompilationGraph = writeThisTcData graphFilePath
+                  let writeThisCompilationReferences = writeThisTcData referencesFilePath

Member

vzarytovskii Nov 11, 2024 •

edited

Loading

All writers should use buffer and flush it in the very end to avoid partial data being in the cached file.

src/Compiler/Driver/ReuseTcResults/ReuseTcResultsDriver.fs

Comment on lines +24 to +26

+                  let cmdLineFilePath = tcDataFolderName ++ "cmdline"
+                  let graphFilePath = tcDataFolderName ++ "graph"
+                  let referencesFilePath = tcDataFolderName ++ "references"

Member

vzarytovskii Nov 11, 2024

I don't think those should be separate files, to avoid partial and incorrect data, it should be as atomic as possible.

Member Author

psfinaki Nov 11, 2024

That was my first idea, but than I thought - granularity adds some flexibility here: if one of those things changes, we won't need to rewrite all of them, saving some time and memory. Multiple files are also a bit easier to work with - there is no need to set up some file structure here (what follows what, empty lines and so on).

What do you think?

Member

vzarytovskii Nov 11, 2024

It risks mixing things, I would put as we do with sigdata or assembly - all in one resource. It's fine rewriting all of them, it's not a lot of data, but it is easier to verify that all data is valid.

src/Compiler/Driver/ReuseTcResults/ReuseTcResultsDriver.fs

+                  let graphFilePath = tcDataFolderName ++ "graph"
+                  let referencesFilePath = tcDataFolderName ++ "references"
+                  let writeThisTcData path content =

Member

vzarytovskii Nov 11, 2024

Some sort of checksum mechanism should also be incorporated into the file, so it's immediately apparent that both format is correct as well as file is well written.

psfinaki force-pushed the reuse-typecheck-a branch 2 times, most recently from 478fde0 to 3f0d13c Compare

November 12, 2024 16:40

psfinaki added 2 commits

November 14, 2024 17:52


          Implementation

d77eea4


          update bsl

84bbc6a

psfinaki force-pushed the reuse-typecheck-a branch from 7cf4e2f to 84bbc6a Compare

November 14, 2024 16:52

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet