-
Notifications
You must be signed in to change notification settings - Fork 786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reuse typechecking results - stage 1 #17898
base: main
Are you sure you want to change the base?
Conversation
❗ Release notes required
Warning No PR link found in some release notes, please consider adding it.
|
tests/benchmarks/FCSBenchmarks/CompilerServiceBenchmarks/ReuseTypecheckingResultsBenchmarks.fs
Outdated
Show resolved
Hide resolved
@T-Gro PTAL, this is an early stage so I mostly just wanted to discuss the questions I raised in the comments, so that we go in the agreed direction. |
It seems like this is going to rely/build on top of
|
@edgarfgp so we'll need to discuss this - personally I'd like to turn graph-checking on. Now (like, before Xmas) is a rather good time - we can catch some bugs there before the next big release. It's somewhat scaring but, well, just this probably shouldn't be stopping us 👀 Graph-checking and reusing typechecking results can test each other and I'd say we should use this benefit here. I am not speaking for the whole team just yet - only sharing some thoughts. We will discuss and prioritize this. |
6df7695
to
be247f3
Compare
ce9c28e
to
3275457
Compare
let getThisCompilationReferences = | ||
List.map (fun (r: AssemblyReference) -> r.Text, getContentHash r.Text) | ||
>> List.map (fun (name, hash) -> $"{name}: {hash}") | ||
>> List.toArray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is A LOT of reading into memory, what's the memory consumption for the whole BCL and runtime assemblies? Can we check MVID and/or date? Also, in this case we gonna be reading them twice? One for reuse tc and in normal compilation. There should be some cache introduced for them if we'll have to read them twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do those things. Checking hash helps in the scenarios where one changes a dependency, reverts it immediately (ctrl + Z) and then triggers the recompilation. If we only look at the date, this will look like a new recompilation.
We can different processes for system and non-system dependencies, do you think it's worth it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just straight up use MVID for assemblies. If it's good for deterministic builds and msbuild, it should be good enough for us. This way you'll only be reading header, and not calculate anything.
if FileSystem.DirectoryExistsShim tcDataFolderName then | ||
use _ = Activity.start Activity.Events.reuseTcResultsCachePresent [] | ||
|
||
let prevCompilationCmdLine = getPrevCompilationCmdLine () | ||
let thisCompilationCmdLine = getThisCompilationCmdLine tcConfig.cmdLineArgs | ||
|
||
if prevCompilationCmdLine = thisCompilationCmdLine then | ||
|
||
let prevCompilationReferences = getPrevCompilationReferences () | ||
let thisCompilationReferences = getThisCompilationReferences tcConfig.referencedDLLs | ||
|
||
if prevCompilationReferences = thisCompilationReferences then | ||
|
||
let prevCompilationGraph = getPrevCompilationGraph () | ||
let thisCompilationGraph = getThisCompilationGraph inputs | ||
|
||
if prevCompilationGraph = thisCompilationGraph then | ||
use _ = Activity.start Activity.Events.reuseTcResultsCacheHit [] | ||
|
||
() // do nothing, yet | ||
else | ||
use _ = Activity.start Activity.Events.reuseTcResultsCacheMissed [] | ||
|
||
writeThisCompilationGraph thisCompilationGraph | ||
else | ||
use _ = Activity.start Activity.Events.reuseTcResultsCacheMissed [] | ||
|
||
writeThisCompilationReferences thisCompilationReferences | ||
else | ||
use _ = Activity.start Activity.Events.reuseTcResultsCacheMissed [] | ||
|
||
writeThisCompilationCmdLine thisCompilationCmdLine | ||
else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can cognitive load be reduced here somehow? Not to have a bunch of nested conditions?
let (++) a b = Path.Combine(a, b) | ||
|
||
[<Sealed>] | ||
type ReuseTcResultsDriver(tcConfig: TcConfig) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably be named differently, something along the lines of CachingDriver
.
let writeThisCompilationCmdLine = writeThisTcData cmdLineFilePath | ||
let writeThisCompilationGraph = writeThisTcData graphFilePath | ||
let writeThisCompilationReferences = writeThisTcData referencesFilePath |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All writers should use buffer and flush it in the very end to avoid partial data being in the cached file.
let cmdLineFilePath = tcDataFolderName ++ "cmdline" | ||
let graphFilePath = tcDataFolderName ++ "graph" | ||
let referencesFilePath = tcDataFolderName ++ "references" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think those should be separate files, to avoid partial and incorrect data, it should be as atomic as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my first idea, but than I thought - granularity adds some flexibility here: if one of those things changes, we won't need to rewrite all of them, saving some time and memory. Multiple files are also a bit easier to work with - there is no need to set up some file structure here (what follows what, empty lines and so on).
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It risks mixing things, I would put as we do with sigdata or assembly - all in one resource. It's fine rewriting all of them, it's not a lot of data, but it is easier to verify that all data is valid.
let graphFilePath = tcDataFolderName ++ "graph" | ||
let referencesFilePath = tcDataFolderName ++ "references" | ||
|
||
let writeThisTcData path content = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some sort of checksum mechanism should also be incorporated into the file, so it's immediately apparent that both format is correct as well as file is well written.
478fde0
to
3f0d13c
Compare
7cf4e2f
to
84bbc6a
Compare
First part of the implementation of reusing typecheck results, as per the design doc.
This does the following:
--reusetypecheckingresults
, with all the hookingNothing happens - yet - we only report related activities though.
Cache invalidation scenarios I looked at: