Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor improvements to the SRS store #321

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open

Conversation

Tabaie
Copy link
Contributor

@Tabaie Tabaie commented Nov 21, 2024

  1. The store attempts to load the Lagrange SRS as it is loading the canonical
  2. Both are loaded directly from file instead of being loaded into bytes at first. This is doubly advantageous if the canonical SRS found is much larger than what we need.
  3. If the Lagrange SRS has to be computed, it will be saved (attributed to the same organization as the canonical SRS used to produce it)
  4. In-memory cache (useful when running multiple commands in the same run)

@codecov-commenter
Copy link

codecov-commenter commented Nov 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.32%. Comparing base (b1567ab) to head (ed6f37d).

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #321      +/-   ##
============================================
+ Coverage     68.40%   70.32%   +1.92%     
+ Complexity     1185     1081     -104     
============================================
  Files           323      303      -20     
  Lines         13131    12433     -698     
  Branches       1319     1188     -131     
============================================
- Hits           8982     8744     -238     
+ Misses         3597     3194     -403     
+ Partials        552      495      -57     
Flag Coverage Δ *Carryforward flag
hardhat 98.52% <ø> (ø)
kotlin 67.99% <ø> (+1.94%) ⬆️ Carriedforward from 1af584c

*This pull request uses carry forward flags. Click here to find out more.

see 114 files with indirect coverage changes

@Tabaie Tabaie had a problem deploying to docker-build-and-e2e November 21, 2024 02:03 — with GitHub Actions Error
}

type fsEntry struct {
isCanonical bool
size int
path string
org string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does "org" stand for ??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Organization": aleo, aztec, or celo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not solely related to that but the structs would deserve some more doc

@@ -102,18 +106,52 @@ func (store *SRSStore) GetSRS(ctx context.Context, ccs constraint.ConstraintSyst
sizeCanonical, sizeLagrange := plonk.SRSSize(ccs)
curveID := fieldToCurve(ccs.Field())

var lagrangeSRS kzg.SRS
loadLagrangeErr := make(chan error, 2)
go func() { // attempt to find lagrange SRS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really win much by loading iin parallel? Makes sense to do it, but just curious, last I checked the readDump was pretty fast

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few seconds, mostly a win when debugging.

return nil, nil, err
}
// save the lagrange SRS to disk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while it looks convenient for dev, I would not do it here, since in production many machine can launch and use a shared directory for the srs, it might create corrupted files.

i.e. if caller wants to serialize new files, he does it on its own, we should not do it on the prover runtime path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point. Imo we could either

  1. Put a lock on the file being written so a prover instance would give up trying to write if it couldn't obtain the lock
  2. Let the function fail if the corresponding Lagrange was not found.
    @AlexandreBelling what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In prod, we don't touch the SRS. Do we? Or if we do, we only read so I'd assume this is fine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only downside I see for option 2 is that we would have to add a few more files to prover/prover-assets for our tests to pass and I know we don't like committing too many files into the repo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah but this is on prover runtime? Then no, we should not write this to disk. This makes the whole system more complicated than needed especially if it happens on the distributed file system in prod. We don't want to deal with the ramification of that.

The cleaner approach would be;

a. Try loading it
b. If it does not, try recomputing it locally but without writing it on disk
c. If not possible, throw a human-readable error explaining what is going on

The persons that are going to see issues related to finding the SRS are the SRE so better make clean error message as it saves time in case of failed deployment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so definitely writing at prover runtime is off the table. No to option 1.
How about not generating locally at all? Since this is such a big performance penalty it should not be "swept under the rug" imo. We can set an upper bound, that up to say 1024 we can generate Lagrange on the fly (so that our tests still pass without storing more stuff on github) but that it also doesn't enable huge setup/prover delays.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something you could do is to enable this writing-to-disk behaviour thanks to a global variable.

@Tabaie Tabaie had a problem deploying to docker-build-and-e2e November 21, 2024 20:35 — with GitHub Actions Error
@Tabaie Tabaie had a problem deploying to docker-build-and-e2e November 22, 2024 19:28 — with GitHub Actions Error
@Tabaie Tabaie had a problem deploying to docker-build-and-e2e November 22, 2024 22:45 — with GitHub Actions Failure
Copy link

cla-assistant bot commented Jan 7, 2025

CLA assistant check
All committers have signed the CLA.

Copy link

github-actions bot commented Feb 7, 2025

PR has had no activity for 30 days. What is blocking it? Is there anything you can do to help move it forward? Without action it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Feb 7, 2025
@Tabaie Tabaie had a problem deploying to docker-build-and-e2e February 11, 2025 19:09 — with GitHub Actions Error
@Tabaie Tabaie temporarily deployed to docker-build-and-e2e February 11, 2025 19:32 — with GitHub Actions Inactive
@Tabaie Tabaie requested a review from gbotrel February 11, 2025 19:37
@Tabaie
Copy link
Contributor Author

Tabaie commented Feb 11, 2025

@gbotrel @AlexandreBelling Removed disk writes.

@github-actions github-actions bot removed the Stale label Feb 12, 2025
@Tabaie Tabaie temporarily deployed to docker-build-and-e2e February 12, 2025 15:11 — with GitHub Actions Inactive
@Tabaie Tabaie had a problem deploying to docker-build-and-e2e February 18, 2025 16:27 — with GitHub Actions Error
@Tabaie Tabaie requested a deployment to docker-build-and-e2e February 19, 2025 15:49 — with GitHub Actions Waiting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants