# Data consistency for peer-to-peer indexers
There are several scenarios in which the Aura indexer can miss a transaction, which results in data inconsistency. We have a set of mechanisms for detecting and fixing such data gaps. Still, since Aura can work as a distributed network of independent validators, we can also use information from other peers to detect and fill data gaps.
To efficiently identify missing data, we:

- Define a so-called "epoch" - 10 000 slots, which (given that Solana closes 1 slot in about 400 ms) is a little more than 1 hour. We also define a "grand epoch" - 100 000 slots.
- For each epoch, calculate a checksum as a hash of all records (what a "record" is can differ depending on the NFT protocol) that were committed in this epoch. The checksum of a grand epoch is calculated as a hash of the checksums of the epochs it includes.
- Exchange grand epoch checksums with a peer.

If a grand epoch checksum differs, exchange with the peer the checksums of all epochs from this grand epoch. For each epoch whose checksum differs from the peer's, exchange the epoch data itself. The epoch arithmetic and checksum chaining are sketched below.
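A minimal sketch of that arithmetic and chaining; the constants and function names are illustrative rather than taken from the Aura codebase, SHA-256 is assumed as the hash, and epochs are numbered from 0 here (the diagrams below use 1-based labels, which is only a labeling choice):

```rust
use sha2::{Digest, Sha256};

const EPOCH_LEN: u64 = 10_000; // slots per epoch
const GRAND_EPOCH_LEN: u64 = 100_000; // slots per grand epoch

// Map a slot to its (0-based) epoch and grand epoch.
fn epoch_of(slot: u64) -> u64 {
    slot / EPOCH_LEN
}

fn grand_epoch_of(slot: u64) -> u64 {
    slot / GRAND_EPOCH_LEN
}

// An epoch checksum is a hash over all records committed in that epoch
// (what a "record" is depends on the NFT protocol).
fn epoch_checksum(records: impl Iterator<Item = Vec<u8>>) -> [u8; 32] {
    let mut hasher = Sha256::new();
    for record in records {
        hasher.update(&record);
    }
    hasher.finalize().into()
}

// A grand epoch checksum is a hash over its epochs' checksums.
fn grand_epoch_checksum(epoch_checksums: &[[u8; 32]]) -> [u8; 32] {
    let mut hasher = Sha256::new();
    for checksum in epoch_checksums {
        hasher.update(checksum);
    }
    hasher.finalize().into()
}
```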
In the example below, our node has Slot 9 in Epoch 1 while the peer is missing it, so the Epoch 1 checksums (and hence the Grand Epoch 1 checksums) differ:

```mermaid
flowchart LR
We <--> Peer
subgraph We
direction TB
ge2-1[Grand Epoch 1] --> e2-1[Epoch 1]
e2-1 --- s2-1[Slot 1]
e2-1 --- s2-2[Slot 9]
e2-1 --- s2-3[Slot 555]
ge2-1 --- e2-2[Epoch 2]
e2-2 --- s2-4[Slot 10001]
e2-2 --- s2-5[Slot 10099]
end
subgraph Peer
direction TB
ge1-1[Grand Epoch 1] --> e1-1[Epoch 1]
e1-1 --- s1-1[Slot 1]
e1-1 --- s1-2[Slot 555]
ge1-1 --- e1-2[Epoch 2]
e1-2 --- s1-3[Slot 10001]
e1-2 --- s1-4[Slot 10099]
end
```
Metaplex provides different NFT protocols, and depending on the protocol there is a different type of additional data that we need to track along with the slot number.

With Bubblegum, a collection of NFTs is stored in a single account in the form of a Merkle tree, and each change of the tree results in a new signature. This means that for Bubblegum we need to track at least the tree pubkey, the slot number, the `seq`, and the signature.
Introducing new RocksDB column families:

- `BubblegumChange`: `(epoch, tree_pubkey, slot, seq) => Signature`
  The leading epoch is needed for efficiency.
- `BubblegumEpoch`: `(tree_pubkey, epoch) => {INVALIDATED | CALCULATING | Checksum}`
- `BubblegumGrandEpoch`: `(tree_pubkey, grand_epoch) => {INVALIDATED | CALCULATING | Checksum}`
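A hedged sketch of how these keys and values might be represented; the types and the big-endian encoding are assumptions for illustration, not the actual Aura schema:

```rust
// Hypothetical key for the BubblegumChange column family. Encoding every
// integer big-endian makes RocksDB's byte-wise key order match the
// (epoch, tree_pubkey, slot, seq) numeric order, so one epoch's changes
// form a contiguous range.
struct BubblegumChangeKey {
    epoch: u64,
    tree_pubkey: [u8; 32],
    slot: u64,
    seq: u64,
}

impl BubblegumChangeKey {
    fn encode(&self) -> Vec<u8> {
        let mut key = Vec::with_capacity(8 + 32 + 8 + 8);
        key.extend_from_slice(&self.epoch.to_be_bytes());
        key.extend_from_slice(&self.tree_pubkey);
        key.extend_from_slice(&self.slot.to_be_bytes());
        key.extend_from_slice(&self.seq.to_be_bytes());
        key
    }
}

// Hypothetical tri-state value stored in BubblegumEpoch / BubblegumGrandEpoch
// (and later in the account bucket families).
enum ChecksumState {
    Invalidated,
    Calculating,
    Checksum([u8; 32]),
}
```

Keeping `INVALIDATED` and `CALCULATING` as explicit states (rather than just deleting the checksum) lets an interrupted calculation be detected and restarted.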
The connection between column families:
```mermaid
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart LR
slot1["epoch 1
pubkey 1
slot 100
seq
signature"]
slot2["epoch 1
pubkey 1
slot 347
seq
signature"]
slot3["epoch 1
pubkey 2
slot 100
seq
signature"]
slot4["epoch 1
pubkey 2
slot 999
seq
signature"]
slot5["epoch 2
pubkey 1
slot 12001
seq
signature"]
epoch1["pubkey 1
epoch 1
checksum"]
epoch2["pubkey 2
epoch 1
checksum"]
epoch3["pubkey 1
epoch 2
checksum"]
grandepoch1["pubkey 1
grand epoch 1
checksum"]
grandepoch2["pubkey 2
grand epoch 1
checksum"]
slot1 --> epoch1
slot2 --> epoch1
slot3 --> epoch2
slot4 --> epoch2
slot5 --> epoch3
epoch1 --> grandepoch1
epoch2 --> grandepoch2
epoch3 --> grandepoch1
```
Tracking Bubblegum changes and calculating checksums:

The asset change record is written in the same RocksDB batch as the asset data itself (a sketch follows below).

With this algorithm, at any point in time we have either a correctly calculated epoch checksum or an empty/invalidated flag. There is no point in the code where the application can crash and leave an inconsistency between change records and checksums.
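A minimal sketch of that write pattern using the `rocksdb` crate; the column family names, key encodings, and the `INVALIDATED` marker bytes are assumptions for illustration, not the actual Aura schema:

```rust
use rocksdb::{WriteBatch, DB};

/// Write the asset data, the change record, and the epoch invalidation marker
/// in a single atomic batch: either all three land or none do, so a crash can
/// never leave change records and checksums out of sync.
fn write_asset_change(
    db: &DB,
    asset_key: &[u8],
    asset_value: &[u8],
    change_key: &[u8], // encoded (epoch, tree_pubkey, slot, seq)
    signature: &[u8],
    epoch_key: &[u8], // encoded (tree_pubkey, epoch)
) -> Result<(), rocksdb::Error> {
    let assets_cf = db.cf_handle("assets").expect("cf exists");
    let changes_cf = db.cf_handle("BubblegumChange").expect("cf exists");
    let epochs_cf = db.cf_handle("BubblegumEpoch").expect("cf exists");

    let mut batch = WriteBatch::default();
    batch.put_cf(assets_cf, asset_key, asset_value);
    batch.put_cf(changes_cf, change_key, signature);
    // Mark the (tree, epoch) checksum as invalidated; the checksum job will
    // recalculate it at the end of the epoch.
    batch.put_cf(epochs_cf, epoch_key, b"INVALIDATED");
    db.write(batch)
}
```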
Account-based protocols store each NFT in a separate account. For such protocols we need to track at least: the slot number, the account pubkey, and the write version. Unlike with transaction-based protocols, for account-based protocols we are not interested in all the intermediate states, only in the very last one.

The data model is the following:
- `AccountChange` - this column family is used as temporary storage for account changes. At the end of an epoch, changes are aggregated by account to find the last one, and are removed from this column family after that.
  `(epoch, account_pubkey, slot_number, write_version) => ()`
  We don't partition account-based checksums by epoch; still, we keep a leading epoch field to make the order convenient for processing: when we iterate the column family and encounter a record with an epoch greater than the one we are currently calculating, all the records after it also belong to the next epoch.
- `AccountState` - stores information about the very last account change we are aware of.
  `(account_pubkey) => {INVALIDATED | CALCULATING | (last_slot_number, last_write_version)}`
- `AccountBuckets` - stores checksums for account ranges from the AccountState column family.
  `(bucket_number) => {INVALIDATED | CALCULATING | Checksum}`
- `AccountGrandBuckets` - stores checksums for bucket ranges from the AccountBuckets column family.
  `(grand_bucket_number) => {INVALIDATED | CALCULATING | Checksum}`
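As with the Bubblegum keys, here is a hedged sketch of the AccountChange key encoding; big-endian integers are an assumption that makes RocksDB's lexicographic key order match the (epoch, account_pubkey, slot_number, write_version) order the processing below relies on:

```rust
// Hypothetical AccountChange key. The value is empty, `()`: the key itself
// carries all the information, and iteration order groups records by epoch
// first, then by account.
fn account_change_key(
    epoch: u64,
    account_pubkey: &[u8; 32],
    slot_number: u64,
    write_version: u64,
) -> Vec<u8> {
    let mut key = Vec::with_capacity(8 + 32 + 8 + 8);
    key.extend_from_slice(&epoch.to_be_bytes());
    key.extend_from_slice(account_pubkey);
    key.extend_from_slice(&slot_number.to_be_bytes());
    key.extend_from_slice(&write_version.to_be_bytes());
    key
}
```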
The connection between these column families:

```mermaid
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart LR
temp_acc1_1["epoch 1
pubkey 1
slot 1
write_version 1"]
temp_acc1_2["epoch 1
pubkey 1
slot 347
write_version 1"]
temp_acc2_1["epoch 1
pubkey 2
slot 100
write_version 1"]
temp_acc2_2["epoch 1
pubkey 2
slot 999
write_version 1"]
temp_acc1_3["epoch 2
pubkey 1
slot 12001
write_version 1"]
acc1["pubkey 1
last_slot 12001
last_write_version 1"]
acc2["pubkey 2
last_slot 999
last_write_version 1"]
bucket1["bucket 1
checksum"]
bucket2["bucket 2
checksum"]
grand_bucket1["grand bucket 1
checksum"]
grand_bucket2["grand bucket 2
checksum"]
temp_acc1_1 --> acc1
temp_acc1_2 --> acc1
temp_acc2_1 --> acc2
temp_acc2_2 --> acc2
temp_acc1_3 --> acc1
acc1 --> bucket1
acc2 --> bucket2
bucket1 --> grand_bucket1
bucket2 --> grand_bucket2
```
Tracking account changes and calculating checksums:

The main idea is that while processing accounts, we collect records about each account (pubkey, slot, write version) into a column family that serves as a queue of changes (a kind of event sourcing). Then, at the start of a new epoch (with account-based NFTs, "epochs" are just time periods at which the calculation is triggered; think of it as a message-batch processing interval), we (see the sketch after this list):
- read the changes from the changes column family
- group the changes by account and identify which buckets should be invalidated
- invalidate the corresponding buckets
- update the corresponding accounts in the column family that keeps the last slot for each account
- based on the updated account states, recalculate the bucket checksums
- update the bucket and grand bucket checksums
- delete the processed changes
It is reasonable to use the first two bytes of an account pubkey as the bucket number (i.e. we'll have 65536 buckets), and the first byte as the grand bucket number (256 grand buckets).
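Below is a condensed sketch of that pass over the change queue; all names are illustrative, storage access is abstracted away, and the bucket derivation follows the byte-based scheme just described:

```rust
use std::collections::{BTreeMap, HashSet};

/// One queued record from the AccountChange column family.
struct AccountChange {
    pubkey: [u8; 32],
    slot: u64,
    write_version: u64,
}

/// The first two bytes of the pubkey select the bucket (65536 buckets).
fn bucket_of(pubkey: &[u8; 32]) -> u16 {
    u16::from_be_bytes([pubkey[0], pubkey[1]])
}

/// The first byte selects the grand bucket (256 grand buckets).
fn grand_bucket_of(pubkey: &[u8; 32]) -> u8 {
    pubkey[0]
}

/// One pass over the queued changes: find the last change per account and the
/// buckets/grand buckets whose checksums must be invalidated and recalculated.
fn process_changes(
    changes: Vec<AccountChange>,
) -> (BTreeMap<[u8; 32], (u64, u64)>, HashSet<u16>, HashSet<u8>) {
    let mut last_per_account: BTreeMap<[u8; 32], (u64, u64)> = BTreeMap::new();
    let mut dirty_buckets = HashSet::new();
    let mut dirty_grand_buckets = HashSet::new();

    for change in changes {
        dirty_buckets.insert(bucket_of(&change.pubkey));
        dirty_grand_buckets.insert(grand_bucket_of(&change.pubkey));
        let candidate = (change.slot, change.write_version);
        // Keep only the newest (slot, write_version) pair per account.
        last_per_account
            .entry(change.pubkey)
            .and_modify(|current| *current = candidate.max(*current))
            .or_insert(candidate);
    }

    // The caller would then mark the dirty buckets INVALIDATED, upsert
    // `last_per_account` into AccountState, recalculate the dirty bucket and
    // grand bucket checksums, and finally delete the processed changes.
    (last_per_account, dirty_buckets, dirty_grand_buckets)
}
```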
In addition to on-chain data, we also want to keep metadata JSON consistent and up to date. The problem is that metadata JSON exists outside of the chain and can be updated in two ways:
- the NFT owner creates another metadata JSON at another URL and calls the Metaplex program to update the URL
- the URL stays the same, but the JSON text has been changed (bad practice, but technically possible)

While the first scenario is handled by the standard flow (the Bubblegum/account processor), the second scenario is more complex. The only reliable way to detect such a change is to periodically re-download each metadata JSON and compare it with the one in our storage.
The approach is very similar to the account-based NFTs:
- We create column families for:
  - metadata JSON changes
  - metadata JSON checksums
  - metadata JSON buckets
- Each time a new metadata JSON is added, changed via the smart contract, or detected as changed by the re-downloader (see the sketch below), a corresponding record is added to the changes column family.
- A background process reads all changes from the previous period (a day?) and updates the metadata JSON checksums.
- After that, it recalculates the checksums for buckets and grand buckets.
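A minimal sketch of the re-downloader's comparison step, assuming SHA-256 checksums; the function names are illustrative, and `stored_checksum` stands for whatever hash we persisted for that URL:

```rust
use sha2::{Digest, Sha256};

/// Hash the metadata JSON text so it can be compared cheaply.
fn json_checksum(json_text: &str) -> [u8; 32] {
    Sha256::digest(json_text.as_bytes()).into()
}

/// If the hash of the re-downloaded JSON differs from the stored one, the
/// off-chain JSON changed under the same URL and a change record is queued.
fn detect_offchain_change(downloaded_json: &str, stored_checksum: &[u8; 32]) -> bool {
    json_checksum(downloaded_json) != *stored_checksum
}
```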
The overall peer-to-peer reconciliation flow looks like this:

```mermaid
sequenceDiagram
We ->> Peer : request grand epochs
activate Peer
Peer ->> We : send grand epochs
deactivate Peer
We ->> We : filter grand epochs that differ
loop grand epochs that differ
We ->> Peer : request epochs
activate Peer
Peer ->> We : send epochs
deactivate Peer
We ->> We : filter epochs that differ
loop epochs that differ
We ->> Peer : request changes
activate Peer
Peer ->> We : send changes
deactivate Peer
We ->> We : process missing changes
We ->> Peer : send changes missing on peer
end
end
```
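A hedged sketch of that reconciliation loop; the `PeerClient` and `LocalStore` traits and their methods are illustrative placeholders, not an actual Aura API:

```rust
use std::collections::{HashMap, HashSet};

/// Checksums keyed by identifier (grand epoch or epoch number).
type Checksums = HashMap<u64, [u8; 32]>;

/// Illustrative peer interface mirroring the sequence diagram.
trait PeerClient {
    fn grand_epoch_checksums(&self) -> Checksums;
    fn epoch_checksums(&self, grand_epoch: u64) -> Checksums;
    fn changes(&self, epoch: u64) -> Vec<Vec<u8>>;
    fn send_changes(&self, epoch: u64, changes: Vec<Vec<u8>>);
}

/// Illustrative view of our own storage.
trait LocalStore {
    fn grand_epoch_checksums(&self) -> Checksums;
    fn epoch_checksums(&self, grand_epoch: u64) -> Checksums;
    fn changes(&self, epoch: u64) -> Vec<Vec<u8>>;
    fn apply_missing(&self, epoch: u64, changes: Vec<Vec<u8>>);
}

/// Identifiers whose checksum is absent or different on either side.
fn diff(ours: &Checksums, theirs: &Checksums) -> Vec<u64> {
    let keys: HashSet<u64> = ours.keys().chain(theirs.keys()).copied().collect();
    keys.into_iter()
        .filter(|k| ours.get(k) != theirs.get(k))
        .collect()
}

fn reconcile(local: &impl LocalStore, peer: &impl PeerClient) {
    for grand_epoch in diff(&local.grand_epoch_checksums(), &peer.grand_epoch_checksums()) {
        for epoch in diff(
            &local.epoch_checksums(grand_epoch),
            &peer.epoch_checksums(grand_epoch),
        ) {
            // Pull the changes we are missing, then push ours back so that
            // both sides converge.
            local.apply_missing(epoch, peer.changes(epoch));
            peer.send_changes(epoch, local.changes(epoch));
        }
    }
}
```

Note that `diff` treats an identifier present on only one side as differing, so gaps on either node (not just mismatched checksums) trigger an exchange.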