-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minor improvements to the SRS store #321
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #321 +/- ##
============================================
+ Coverage 68.40% 70.32% +1.92%
+ Complexity 1185 1081 -104
============================================
Files 323 303 -20
Lines 13131 12433 -698
Branches 1319 1188 -131
============================================
- Hits 8982 8744 -238
+ Misses 3597 3194 -403
+ Partials 552 495 -57
*This pull request uses carry forward flags. Click here to find out more. |
prover/circuits/srs_store.go
Outdated
} | ||
|
||
type fsEntry struct { | ||
isCanonical bool | ||
size int | ||
path string | ||
org string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does "org" stand for ??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Organization": aleo, aztec, or celo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not solely related to that but the structs would deserve some more doc
@@ -102,18 +106,52 @@ func (store *SRSStore) GetSRS(ctx context.Context, ccs constraint.ConstraintSyst | |||
sizeCanonical, sizeLagrange := plonk.SRSSize(ccs) | |||
curveID := fieldToCurve(ccs.Field()) | |||
|
|||
var lagrangeSRS kzg.SRS | |||
loadLagrangeErr := make(chan error, 2) | |||
go func() { // attempt to find lagrange SRS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really win much by loading iin parallel? Makes sense to do it, but just curious, last I checked the readDump was pretty fast
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few seconds, mostly a win when debugging.
prover/circuits/srs_store.go
Outdated
return nil, nil, err | ||
} | ||
// save the lagrange SRS to disk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while it looks convenient for dev, I would not do it here, since in production many machine can launch and use a shared directory for the srs, it might create corrupted files.
i.e. if caller wants to serialize new files, he does it on its own, we should not do it on the prover runtime path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. Imo we could either
- Put a lock on the file being written so a prover instance would give up trying to write if it couldn't obtain the lock
- Let the function fail if the corresponding Lagrange was not found.
@AlexandreBelling what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In prod, we don't touch the SRS. Do we? Or if we do, we only read so I'd assume this is fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only downside I see for option 2 is that we would have to add a few more files to prover/prover-assets
for our tests to pass and I know we don't like committing too many files into the repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah but this is on prover runtime? Then no, we should not write this to disk. This makes the whole system more complicated than needed especially if it happens on the distributed file system in prod. We don't want to deal with the ramification of that.
The cleaner approach would be;
a. Try loading it
b. If it does not, try recomputing it locally but without writing it on disk
c. If not possible, throw a human-readable error explaining what is going on
The persons that are going to see issues related to finding the SRS are the SRE so better make clean error message as it saves time in case of failed deployment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay so definitely writing at prover runtime is off the table. No to option 1.
How about not generating locally at all? Since this is such a big performance penalty it should not be "swept under the rug" imo. We can set an upper bound, that up to say 1024 we can generate Lagrange on the fly (so that our tests still pass without storing more stuff on github) but that it also doesn't enable huge setup/prover delays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something you could do is to enable this writing-to-disk behaviour thanks to a global variable.
PR has had no activity for 30 days. What is blocking it? Is there anything you can do to help move it forward? Without action it will be closed in 7 days. |
@gbotrel @AlexandreBelling Removed disk writes. |
If the Lagrange SRS has to be computed, it will be saved (attributed to the same organization as the canonical SRS used to produce it)