# Data consistency for peer-to-peer indexers
There are several scenarios in which the Aura indexer can miss a transaction, which results in data inconsistency. We have a set of mechanisms for detecting and fixing such data gaps. Still, since Aura can work as a distributed network of independent validators, we can also use information from other peers to detect and fill data gaps.
To efficiently identify missing data, we:

- Define a so-called "epoch" - 10 000 slots, which (given that Solana closes 1 slot in about 400 ms) is a little more than 1 hour. We also define a "grand epoch" - 100 000 slots.
- For each epoch, calculate a checksum as a hash of all records (what a "record" is can differ depending on the NFT protocol) that were committed in this epoch. The checksum of a grand epoch is calculated as a hash of the checksums of the epochs it includes.
- Exchange grand epoch checksums with a peer.

If a grand epoch checksum differs, exchange with the peer the checksums of all epochs from this grand epoch. For each epoch whose checksum differs from the peer's, exchange the epoch data itself. The epoch arithmetic and checksum chaining are sketched below.
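A minimal sketch of that arithmetic and chaining; the constants and function names are illustrative rather than taken from the Aura codebase, SHA-256 is assumed as the hash, and epochs are numbered from 0 here (the diagrams below use 1-based labels, which is only a labeling choice):

```rust
use sha2::{Digest, Sha256};

const EPOCH_LEN: u64 = 10_000; // slots per epoch
const GRAND_EPOCH_LEN: u64 = 100_000; // slots per grand epoch

// Map a slot to its (0-based) epoch and grand epoch.
fn epoch_of(slot: u64) -> u64 {
    slot / EPOCH_LEN
}

fn grand_epoch_of(slot: u64) -> u64 {
    slot / GRAND_EPOCH_LEN
}

// An epoch checksum is a hash over all records committed in that epoch
// (what a "record" is depends on the NFT protocol).
fn epoch_checksum(records: impl Iterator<Item = Vec<u8>>) -> [u8; 32] {
    let mut hasher = Sha256::new();
    for record in records {
        hasher.update(&record);
    }
    hasher.finalize().into()
}

// A grand epoch checksum is a hash over its epochs' checksums.
fn grand_epoch_checksum(epoch_checksums: &[[u8; 32]]) -> [u8; 32] {
    let mut hasher = Sha256::new();
    for checksum in epoch_checksums {
        hasher.update(checksum);
    }
    hasher.finalize().into()
}
```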
In the example below, our node has Slot 9 in Epoch 1 while the peer is missing it, so the Epoch 1 checksums (and hence the Grand Epoch 1 checksums) differ:

```mermaid
flowchart LR
We <--> Peer
subgraph We
direction TB
ge2-1[Grand Epoch 1] --> e2-1[Epoch 1]
e2-1 --- s2-1[Slot 1]
e2-1 --- s2-2[Slot 9]
e2-1 --- s2-3[Slot 555]
ge2-1 --- e2-2[Epoch 2]
e2-2 --- s2-4[Slot 10001]
e2-2 --- s2-5[Slot 10099]
end
subgraph Peer
direction TB
ge1-1[Grand Epoch 1] --> e1-1[Epoch 1]
e1-1 --- s1-1[Slot 1]
e1-1 --- s1-2[Slot 555]
ge1-1 --- e1-2[Epoch 2]
e1-2 --- s1-3[Slot 10001]
e1-2 --- s1-4[Slot 10099]
end
```
Metaplex provides different NFT protocols, and depending on the protocol there is a different type of additional data that we need to track along with the slot number.

With Bubblegum, a collection of NFTs is stored in a single account in the form of a Merkle tree, and each change of the tree results in a new signature. This means that for Bubblegum we need to track at least the tree pubkey, the slot number, the `seq`, and the signature.
Introducing new RocksDB column families:

- `BubblegumChange`: `(epoch, tree_pubkey, slot, seq) => Signature`
  The leading epoch is needed for efficiency.
- `BubblegumEpoch`: `(tree_pubkey, epoch) => {INVALIDATED | CALCULATING | Checksum}`
- `BubblegumGrandEpoch`: `(tree_pubkey, grand_epoch) => {INVALIDATED | CALCULATING | Checksum}`
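A hedged sketch of how these keys and values might be represented; the types and the big-endian encoding are assumptions for illustration, not the actual Aura schema:

```rust
// Hypothetical key for the BubblegumChange column family. Encoding every
// integer big-endian makes RocksDB's byte-wise key order match the
// (epoch, tree_pubkey, slot, seq) numeric order, so one epoch's changes
// form a contiguous range.
struct BubblegumChangeKey {
    epoch: u64,
    tree_pubkey: [u8; 32],
    slot: u64,
    seq: u64,
}

impl BubblegumChangeKey {
    fn encode(&self) -> Vec<u8> {
        let mut key = Vec::with_capacity(8 + 32 + 8 + 8);
        key.extend_from_slice(&self.epoch.to_be_bytes());
        key.extend_from_slice(&self.tree_pubkey);
        key.extend_from_slice(&self.slot.to_be_bytes());
        key.extend_from_slice(&self.seq.to_be_bytes());
        key
    }
}

// Hypothetical tri-state value stored in BubblegumEpoch / BubblegumGrandEpoch
// (and later in the account bucket families).
enum ChecksumState {
    Invalidated,
    Calculating,
    Checksum([u8; 32]),
}
```

Keeping `INVALIDATED` and `CALCULATING` as explicit states (rather than just deleting the checksum) lets an interrupted calculation be detected and restarted.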
The connection between column families:
```mermaid
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart LR
slot1["epoch 1
pubkey 1
slot 100
seq
signature"]
slot2["epoch 1
pubkey 1
slot 347
seq
signature"]
slot3["epoch 1
pubkey 2
slot 100
seq
signature"]
slot4["epoch 1
pubkey 2
slot 999
seq
signature"]
slot5["epoch 2
pubkey 1
slot 12001
seq
signature"]
epoch1["pubkey 1
epoch 1
checksum"]
epoch2["pubkey 2
epoch 1
checksum"]
epoch3["pubkey 1
epoch 2
checksum"]
grandepoch1["pubkey 1
grand epoch 1
checksum"]
grandepoch2["pubkey 2
grand epoch 1
checksum"]
slot1 --> epoch1
slot2 --> epoch1
slot3 --> epoch2
slot4 --> epoch2
slot5 --> epoch3
epoch1 --> grandepoch1
epoch2 --> grandepoch2
epoch3 --> grandepoch1
```
Tracking Bubblegum changes and calculating checksums:

The asset change record is written in the same RocksDB batch as the asset data itself (a sketch follows below).

With this algorithm, at any point in time we have either a correctly calculated epoch checksum or an empty/invalidated flag. There is no point in the code where the application can crash and leave an inconsistency between change records and checksums.
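A minimal sketch of that write pattern using the `rocksdb` crate; the column family names, key encodings, and the `INVALIDATED` marker bytes are assumptions for illustration, not the actual Aura schema:

```rust
use rocksdb::{WriteBatch, DB};

/// Write the asset data, the change record, and the epoch invalidation marker
/// in a single atomic batch: either all three land or none do, so a crash can
/// never leave change records and checksums out of sync.
fn write_asset_change(
    db: &DB,
    asset_key: &[u8],
    asset_value: &[u8],
    change_key: &[u8], // encoded (epoch, tree_pubkey, slot, seq)
    signature: &[u8],
    epoch_key: &[u8], // encoded (tree_pubkey, epoch)
) -> Result<(), rocksdb::Error> {
    let assets_cf = db.cf_handle("assets").expect("cf exists");
    let changes_cf = db.cf_handle("BubblegumChange").expect("cf exists");
    let epochs_cf = db.cf_handle("BubblegumEpoch").expect("cf exists");

    let mut batch = WriteBatch::default();
    batch.put_cf(assets_cf, asset_key, asset_value);
    batch.put_cf(changes_cf, change_key, signature);
    // Mark the (tree, epoch) checksum as invalidated; the checksum job will
    // recalculate it at the end of the epoch.
    batch.put_cf(epochs_cf, epoch_key, b"INVALIDATED");
    db.write(batch)
}
```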
Account-based protocols store each NFT in a separate account. For such protocols we need to track at least: the slot number, the account pubkey, and the write version. Unlike with transaction-based protocols, for account-based protocols we are not interested in all the intermediate states, only in the very last one.

The data model is the following:
- `AccountChange` - this column family is used as temporary storage for account changes. At the end of an epoch, changes are aggregated by account to find the last one, and are removed from this column family after that.
  `(epoch, account_pubkey, slot_number, write_version) => ()`
  We don't partition account-based checksums by epoch; still, we keep a leading epoch field to make the order convenient for processing: when we iterate the column family and encounter a record with an epoch greater than the one we are currently calculating, all the records after it also belong to the next epoch.
- `AccountState` - stores information about the very last account change we are aware of.
  `(account_pubkey) => {INVALIDATED | CALCULATING | (last_slot_number, last_write_version)}`
- `AccountBuckets` - stores checksums for account ranges from the AccountState column family.
  `(bucket_number) => {INVALIDATED | CALCULATING | Checksum}`
- `AccountGrandBuckets` - stores checksums for bucket ranges from the AccountBuckets column family.
  `(grand_bucket_number) => {INVALIDATED | CALCULATING | Checksum}`
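As with the Bubblegum keys, here is a hedged sketch of the AccountChange key encoding; big-endian integers are an assumption that makes RocksDB's lexicographic key order match the (epoch, account_pubkey, slot_number, write_version) order the processing below relies on:

```rust
// Hypothetical AccountChange key. The value is empty, `()`: the key itself
// carries all the information, and iteration order groups records by epoch
// first, then by account.
fn account_change_key(
    epoch: u64,
    account_pubkey: &[u8; 32],
    slot_number: u64,
    write_version: u64,
) -> Vec<u8> {
    let mut key = Vec::with_capacity(8 + 32 + 8 + 8);
    key.extend_from_slice(&epoch.to_be_bytes());
    key.extend_from_slice(account_pubkey);
    key.extend_from_slice(&slot_number.to_be_bytes());
    key.extend_from_slice(&write_version.to_be_bytes());
    key
}
```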
The connection between these column families:

```mermaid
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart LR
temp_acc1_1["epoch 1
pubkey 1
slot 1
write_version 1"]
temp_acc1_2["epoch 1
pubkey 1
slot 347
write_version 1"]
temp_acc2_1["epoch 1
pubkey 2
slot 100
write_version 1"]
temp_acc2_2["epoch 1
pubkey 2
slot 999
write_version 1"]
temp_acc1_3["epoch 2
pubkey 1
slot 12001
write_version 1"]
acc1["pubkey 1
last_slot 12001
last_write_version 1"]
acc2["pubkey 2
last_slot 999
last_write_version 1"]
bucket1["bucket 1
checksum"]
bucket2["bucket 2
checksum"]
grand_bucket1["grand bucket 1
checksum"]
grand_bucket2["grand bucket 2
checksum"]
temp_acc1_1 --> acc1
temp_acc1_2 --> acc1
temp_acc2_1 --> acc2
temp_acc2_2 --> acc2
temp_acc1_3 --> acc1
acc1 --> bucket1
acc2 --> bucket2
bucket1 --> grand_bucket1
bucket2 --> grand_bucket2
```
Tracking account changes and calculating checksums:

The main idea is that while processing accounts, we collect records about each account (pubkey, slot, write version) into a column family that serves as a queue of changes (a kind of event sourcing). Then, at the start of a new epoch (with account-based NFTs, "epochs" are just time periods at which the calculation is triggered; think of it as a message-batch processing interval), we (see the sketch after this list):
- read the changes from the changes column family
- group the changes by account and identify which buckets should be invalidated
- invalidate the corresponding buckets
- update the corresponding accounts in the column family that keeps the last slot for each account
- based on the updated account states, recalculate the bucket checksums
- update the bucket and grand bucket checksums
- delete the processed changes
It is reasonable to use the first two bytes of an account pubkey as the bucket number (i.e. we'll have 65536 buckets), and the first byte as the grand bucket number (256 grand buckets).
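Below is a condensed sketch of that pass over the change queue; all names are illustrative, storage access is abstracted away, and the bucket derivation follows the byte-based scheme just described:

```rust
use std::collections::{BTreeMap, HashSet};

/// One queued record from the AccountChange column family.
struct AccountChange {
    pubkey: [u8; 32],
    slot: u64,
    write_version: u64,
}

/// The first two bytes of the pubkey select the bucket (65536 buckets).
fn bucket_of(pubkey: &[u8; 32]) -> u16 {
    u16::from_be_bytes([pubkey[0], pubkey[1]])
}

/// The first byte selects the grand bucket (256 grand buckets).
fn grand_bucket_of(pubkey: &[u8; 32]) -> u8 {
    pubkey[0]
}

/// One pass over the queued changes: find the last change per account and the
/// buckets/grand buckets whose checksums must be invalidated and recalculated.
fn process_changes(
    changes: Vec<AccountChange>,
) -> (BTreeMap<[u8; 32], (u64, u64)>, HashSet<u16>, HashSet<u8>) {
    let mut last_per_account: BTreeMap<[u8; 32], (u64, u64)> = BTreeMap::new();
    let mut dirty_buckets = HashSet::new();
    let mut dirty_grand_buckets = HashSet::new();

    for change in changes {
        dirty_buckets.insert(bucket_of(&change.pubkey));
        dirty_grand_buckets.insert(grand_bucket_of(&change.pubkey));
        let candidate = (change.slot, change.write_version);
        // Keep only the newest (slot, write_version) pair per account.
        last_per_account
            .entry(change.pubkey)
            .and_modify(|current| *current = candidate.max(*current))
            .or_insert(candidate);
    }

    // The caller would then mark the dirty buckets INVALIDATED, upsert
    // `last_per_account` into AccountState, recalculate the dirty bucket and
    // grand bucket checksums, and finally delete the processed changes.
    (last_per_account, dirty_buckets, dirty_grand_buckets)
}
```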
In addition to on-chain data, we also want to keep metadata JSON consistent and up to date. The problem is that metadata JSON exists outside of the chain and can be updated in two ways:
- the NFT owner creates another metadata JSON at another URL and calls the Metaplex program to update the URL
- the URL stays the same, but the JSON text has been changed (bad practice, but technically possible)

While the first scenario is handled by the standard flow (the Bubblegum/account processor), the second scenario is more complex. The only reliable way to detect such a change is to periodically re-download each metadata JSON and compare it with the one in our storage.
The approach is very similar to the account-based NFTs:
- We create column families for:
  - metadata JSON changes
  - metadata JSON checksums
  - metadata JSON buckets
- Each time a new metadata JSON is added, changed via the smart contract, or detected as changed by the re-downloader (see the sketch below), a corresponding record is added to the changes column family.
- A background process reads all changes from the previous period (a day?) and updates the metadata JSON checksums.
- After that, it recalculates the checksums for buckets and grand buckets.
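A minimal sketch of the re-downloader's comparison step, assuming SHA-256 checksums; the function names are illustrative, and `stored_checksum` stands for whatever hash we persisted for that URL:

```rust
use sha2::{Digest, Sha256};

/// Hash the metadata JSON text so it can be compared cheaply.
fn json_checksum(json_text: &str) -> [u8; 32] {
    Sha256::digest(json_text.as_bytes()).into()
}

/// If the hash of the re-downloaded JSON differs from the stored one, the
/// off-chain JSON changed under the same URL and a change record is queued.
fn detect_offchain_change(downloaded_json: &str, stored_checksum: &[u8; 32]) -> bool {
    json_checksum(downloaded_json) != *stored_checksum
}
```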
The overall peer-to-peer reconciliation flow looks like this:

```mermaid
sequenceDiagram
We ->> Peer : request grand epochs
activate Peer
Peer ->> We : send grand epochs
deactivate Peer
We ->> We : filter grand epochs that differ
loop grand epochs that differ
We ->> Peer : request epochs
activate Peer
Peer ->> We : send epochs
deactivate Peer
We ->> We : filter epochs that differ
loop epochs that differ
We ->> Peer : request changes
activate Peer
Peer ->> We : send changes
deactivate Peer
We ->> We : process missing changes
We ->> Peer : send changes missing on peer
end
end
```
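A hedged sketch of that reconciliation loop; the `PeerClient` and `LocalStore` traits and their methods are illustrative placeholders, not an actual Aura API:

```rust
use std::collections::{HashMap, HashSet};

/// Checksums keyed by identifier (grand epoch or epoch number).
type Checksums = HashMap<u64, [u8; 32]>;

/// Illustrative peer interface mirroring the sequence diagram.
trait PeerClient {
    fn grand_epoch_checksums(&self) -> Checksums;
    fn epoch_checksums(&self, grand_epoch: u64) -> Checksums;
    fn changes(&self, epoch: u64) -> Vec<Vec<u8>>;
    fn send_changes(&self, epoch: u64, changes: Vec<Vec<u8>>);
}

/// Illustrative view of our own storage.
trait LocalStore {
    fn grand_epoch_checksums(&self) -> Checksums;
    fn epoch_checksums(&self, grand_epoch: u64) -> Checksums;
    fn changes(&self, epoch: u64) -> Vec<Vec<u8>>;
    fn apply_missing(&self, epoch: u64, changes: Vec<Vec<u8>>);
}

/// Identifiers whose checksum is absent or different on either side.
fn diff(ours: &Checksums, theirs: &Checksums) -> Vec<u64> {
    let keys: HashSet<u64> = ours.keys().chain(theirs.keys()).copied().collect();
    keys.into_iter()
        .filter(|k| ours.get(k) != theirs.get(k))
        .collect()
}

fn reconcile(local: &impl LocalStore, peer: &impl PeerClient) {
    for grand_epoch in diff(&local.grand_epoch_checksums(), &peer.grand_epoch_checksums()) {
        for epoch in diff(
            &local.epoch_checksums(grand_epoch),
            &peer.epoch_checksums(grand_epoch),
        ) {
            // Pull the changes we are missing, then push ours back so that
            // both sides converge.
            local.apply_missing(epoch, peer.changes(epoch));
            peer.send_changes(epoch, local.changes(epoch));
        }
    }
}
```

Note that `diff` treats an identifier present on only one side as differing, so gaps on either node (not just mismatched checksums) trigger an exchange.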