-
Notifications
You must be signed in to change notification settings - Fork 785
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compute columns in post-PeerDAS checkpoint sync #6760
base: unstable
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice and simple! Should have done this ages ago 😅
Just a TODO
comment to address, but we can probably start testing with this!
@michaelsproul I think this approach is simple and maintains same behavour as Deneb - but keen to hear if you have any preference or benefit of going with the alternatives discussed earlier (breaking the existing db invariant and populate the columns via backfill / p2p etc)?
Yeah it is a bit of a pain, this code is executed before network is initialised so we don't have the node ID yet, and I'm not sure if it's worth doing a big refactor for this. Why wouldn't blob pruning delete the extras? Looks like it iterate through all columns for the root, unless i'm looking at the wrong thing? lighthouse/beacon_node/store/src/hot_cold_store.rs Lines 2849 to 2859 in 8a6e3bf
@dapplion I think this should work - would you mind doing a quick test this locally before we merge? |
Oh I think the downside with this approach is that the checkpoint server would have to serve blobs, i.e. be a supernode. The latter approach above would allow syncing columns from peers instead, and checkpointz server doesn't have to store all columns (if we have 16mb blobs per block, that'd be ~2TB of blob data that the checkpointz server have to store). Or if we're ok with checkpointz server serving blobs, we could potentially introduce a |
This PR addresses #6026 |
@jimmygchen forgive my ignorance on the specs here (and indeed the existing implementation).. quick question. instead of storing the blobs on a checkpoint server, wouldn't it be easier to store the hash only such that the clients upon launch would p2p query / gossip those newly requested blobs? From my understanding, the objective of the checkpoint server is to allow a newly launched node to get going with minimal effort... but just like the state, LH still had to backfill... wouldn't a similar logic apply here? thx . |
Hi @mrabino1 Yes it is possible to download the blobs from peers, with the alternative @dapplion mentioned:
The proposed solution in this PR behaves similarly to Mainnet today - we download and store the checkpoint state, block and blobs - so this approach requires minimal changes in Lighthouse, and will likely work without additional effort - it may not be the final solution but could be useful for testing now. We could consider the alternative to download from peers - it means we'd have to break the current invariant of storing complete block and blobs in the database, and will need to have some extra logic to populate the finalized block when backfill completes. |
I think breaking the invariant that we have all blobs/columns for blocks probably makes sense, seeing as there are multiple cases now where this is useful. The other main use case is importing partial blob history, e.g. all txns related to some L2. This is being worked on in: I think it makes sense to restructure the DB storage as well so that we don't index by |
I think this PR is fine for now until we get our ducks in a row |
You are right, pruning does clean them up. |
Current DB indexes columns by [root:column_index], what do you propose to index with? |
+1, would be useful to be able to checkpoint sync on devnets, we could also test backfill. @dapplion do you want to get this tested and merged first? or consider the alternatives? |
Issue Addressed
Addresses #6026.
Post-PeerDAS the DB expects to have data columns for the finalized block.
Proposed Changes
Instead of forcing the user to submit the columns, this PR computes the columns from the blobs that we can already fetch from the checkpointz server or with the existing CLI options.
Note 1: (EDIT) Pruning concern addressed
Note 2: I have not tested this feature
Note 3: @michaelsproul an alternative I recall is to not require the blobs / columns at this point and expect backfill to populate the finalized block