AppViewLite

AppViewLite is an ATProto (Bluesky) appview focused on low resource consumption, able to run independently of the main appview APIs.

It includes:

  • A firehose listener and indexer (AppViewLite)
  • A simple web UI for viewing the indexed data (AppViewLite.Web)
  • An XRPC interface that allows you to reuse the official TypeScript client implementation

Screenshot of the bsky.app profile on AppViewLite

Indexing the firehose (posts, likes, reposts, follows, blocks) takes about 2.2 GB of disk space per day. By contrast, the raw data from the firehose (without inverse indexes) is reported to be around 200 GB per day.

If you try to access a post or profile that hasn't been indexed (because it was posted while the indexer wasn't running), it will be fetched from the PDS.
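The fallback described above can be sketched as a lookup that goes to the network only on a miss. This is a minimal illustration, not the project's actual code: `load_post`, `get_record_url`, the in-memory `index` dict, the injectable `fetch` callable, and the `pds.example` host are all hypothetical. It also assumes the PDS is queried through the standard `com.atproto.repo.getRecord` XRPC endpoint and that the fetched record is stored afterwards; the README confirms neither.

```python
from urllib.parse import urlencode


def get_record_url(pds_base, did, collection, rkey):
    """Build a com.atproto.repo.getRecord URL for one record on a PDS."""
    query = urlencode({"repo": did, "collection": collection, "rkey": rkey})
    return f"{pds_base.rstrip('/')}/xrpc/com.atproto.repo.getRecord?{query}"


def load_post(index, pds_base, did, rkey, fetch):
    """Return a post from the local index, falling back to the PDS on a miss.

    `index` is a plain dict standing in for the on-disk index (assumption).
    `fetch` is any callable that takes a URL and returns the decoded record,
    so the network layer can be swapped out or mocked.
    """
    key = (did, rkey)
    if key in index:
        return index[key]
    record = fetch(get_record_url(pds_base, did, "app.bsky.feed.post", rkey))
    index[key] = record  # assumption: remember the record for next time
    return record
```

Passing `fetch` in as a parameter keeps the sketch free of any particular HTTP library and makes the miss path easy to exercise without a network connection.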

This AppView runs independently of the main bsky.app appview, except for CDN image delivery and handle resolution. Other than that, it only needs a relay.

Tip: You can browse to http://localhost:PORT/https://bsky.app/... to easily convert a bsky.app URL into an AppViewLite one.

Implementation status

Storage mechanism

Each "table" is a set of memory-mapped columnar storage files that associates each key with one or more values. The keys are sorted, and so are the values within each key, which enables fast binary-search lookups. All the slices of a table are periodically compacted into larger slices.
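To make the idea concrete, here is a small in-memory sketch of sorted slices with binary-search lookups and compaction. It is only an illustration under stated assumptions: Python lists stand in for the memory-mapped column files, the `Slice` and `Table` classes and their methods are hypothetical names, and dropping duplicate values during compaction is an assumption rather than documented behavior.

```python
from bisect import bisect_left


class Slice:
    """One immutable slice: a sorted key column plus sorted values per key."""

    def __init__(self, mapping):
        # Keys are sorted, and so are the values stored under each key.
        self.keys = sorted(mapping)
        self.values = [sorted(mapping[k]) for k in self.keys]

    def get(self, key):
        i = bisect_left(self.keys, key)  # binary search over the key column
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return []

    def contains(self, key, value):
        vals = self.get(key)
        j = bisect_left(vals, value)  # binary search within the key's values
        return j < len(vals) and vals[j] == value


class Table:
    """A table is a list of slices; lookups consult every slice."""

    def __init__(self):
        self.slices = []

    def add_slice(self, mapping):
        self.slices.append(Slice(mapping))

    def get(self, key):
        found = set()
        for s in self.slices:
            found.update(s.get(key))
        return sorted(found)

    def compact(self):
        # Merge all slices into a single larger one (duplicates are dropped,
        # which is an assumption of this sketch).
        merged = {}
        for s in self.slices:
            for k, vals in zip(s.keys, s.values):
                merged.setdefault(k, set()).update(vals)
        self.slices = [Slice(merged)]
```

For example, a table of likes could map a post ID to the sorted IDs of the accounts that liked it. Fewer, larger slices mean fewer binary searches per lookup, which is the motivation for periodic compaction in this sketch.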

Identifiers

To save space, accounts are rekeyed using 32-bit integers, and RKeys are converted back into their underlying 64-bit values.
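The sketch below illustrates both ideas. It assumes the record keys are TIDs as defined by ATProto (13 characters in the `234567abcdefghijklmnopqrstuvwxyz` base32-sortable alphabet, encoding a 64-bit integer). The `AccountIds` class and the function names are hypothetical, and the real project may assign account IDs differently.

```python
B32_SORTABLE = "234567abcdefghijklmnopqrstuvwxyz"


def tid_to_int(rkey):
    """Decode a 13-character TID record key into its integer value."""
    if len(rkey) != 13:
        raise ValueError("a TID is exactly 13 characters long")
    value = 0
    for ch in rkey:
        # str.index raises ValueError for characters outside the alphabet.
        value = (value << 5) | B32_SORTABLE.index(ch)
    return value


def int_to_tid(value):
    """Encode an integer back into its 13-character TID form."""
    chars = []
    for _ in range(13):
        chars.append(B32_SORTABLE[value & 31])
        value >>= 5
    return "".join(reversed(chars))


class AccountIds:
    """Assigns small sequential integers to DIDs (hypothetical helper)."""

    def __init__(self):
        self._by_did = {}
        self._dids = []

    def intern(self, did):
        existing = self._by_did.get(did)
        if existing is not None:
            return existing
        new_id = len(self._dids) + 1  # IDs start at 1 in this sketch
        if new_id >= 2**32:
            raise OverflowError("account IDs no longer fit in 32 bits")
        self._by_did[did] = new_id
        self._dids.append(did)
        return new_id

    def did(self, account_id):
        return self._dids[account_id - 1]
```

Stored as text, a TID takes 13 bytes; stored as an integer it takes 8. Likewise, a DID string of a few dozen bytes shrinks to 4 bytes everywhere it is referenced.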

Post text

Post data is compressed in several steps. The text is first turned into GPT/Tiktoken tokens, and the 18-bit tokens are then encoded using a variable-length bit representation. Everything is serialized into a Protobuf message (along with other metadata), which is finally Brotli-compressed. This was the most compact representation I could find after some experimentation.
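The exact bit layout is not described here, so the following is only one possible variable-length bit encoding, shown to illustrate the general idea. It assumes the token IDs have already been produced by a tokenizer, and it writes each ID as one to three 7-bit groups (one continuation bit plus six payload bits, least significant group first). The function names are hypothetical, and the Protobuf and Brotli steps are left out.

```python
def encode_tokens(tokens):
    """Pack token IDs (each below 2**18) into bytes using 7-bit groups."""
    bits = []
    for t in tokens:
        if not 0 <= t < (1 << 18):
            raise ValueError("token does not fit in 18 bits")
        while True:
            payload = t & 0x3F
            t >>= 6
            more = 1 if t else 0
            bits.append(more)  # continuation bit: 1 means another group follows
            bits.extend((payload >> i) & 1 for i in range(5, -1, -1))
            if not more:
                break
    bits.extend([0] * (-len(bits) % 8))  # pad to a whole number of bytes
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def decode_tokens(data, count):
    """Inverse of encode_tokens; `count` is the number of tokens to read."""
    bits = [(byte >> i) & 1 for byte in data for i in range(7, -1, -1)]
    pos = 0
    tokens = []
    for _ in range(count):
        value = 0
        shift = 0
        while True:
            more = bits[pos]
            payload = 0
            for b in bits[pos + 1:pos + 7]:
                payload = (payload << 1) | b
            pos += 7
            value |= payload << shift
            shift += 6
            if not more:
                break
        tokens.append(value)
    return tokens
```

With this scheme, IDs below 64 take 7 bits, IDs below 4096 take 14 bits, and the rest take 21 bits, compared with a fixed 18 bits per token. Whether that is a net saving depends on how often small IDs occur in the vocabulary. The token count is passed to the decoder separately because the zero padding at the end could otherwise be read as an extra token.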