How to migrate from Repo v2→v3 #1748
Unfortunately we don't have a happy answer for you. We don't yet support long-running PDS instances with documented migration paths. We still have a (small) handful of changes planned before federation (list here: #1711), and some of those may involve migrations and refactors which we don't document or support for other folks. We do support running a version of our PDS in the sandbox environment, but the assumption there is that all data and identities are ephemeral and wiped periodically, rather than migrated forward.

This will all change at once when we get to federation: a number of moving pieces will stabilize, and it will be crucial for other folks to run PDS instances of arbitrary size in a reliable manner. We intend to better document deployment and operation of the reference PDS implementation from that point on, including migrations.

That's the party line, but what can be done for your specific situation? My gut feeling is that the best thing for you to do is to stick with what you are running now for "a few more months" (scare quotes) until the reference PDS is in a stable and better-supported state. Then, do full repo exports (blobs and CAR files), run a v2 -> vN conversion script on each individual repo CAR file if needed, and re-import everything into a fresh PDS instance. Basically, migrate all the accounts to a "new" instance instead of upgrading in place. If you need help with the CAR-to-CAR conversion script, I can probably whip up a little golang CLI to help. Note that the current PDS implementation doesn't actually implement the repo import endpoint, so this doesn't work today; it should work relatively soon, to enable account migration.

Another approach would be some hybrid of attempting a few intermediate updates and migrations along the way. I don't think we have any internal documentation, or even much memory, of how we did all of those migrations at this point. I believe v3 was much less disruptive than v2, but there was a big meta-migration when we collapsed all the previous database migrations.

As a heads-up about a particularly disruptive upcoming change: we are likely going to drop support for PostgreSQL in the reference PDS and instead move to a data storage model where each repo has a separate SQLite file on disk. We are not planning to provide a clean migration path, script, or documentation for the delicate cut-over this will require.

Another approach would be forking and backporting subsets of development to keep your PDS working. Probably a bunch of work!
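To make the export → convert → re-import flow above concrete, here is a minimal dry-run sketch. `com.atproto.sync.getRepo` is a real sync endpoint that returns a repo as a CAR file; the `car-v2-to-v3` conversion tool is hypothetical (the golang CLI mentioned above doesn't exist yet), the import endpoint is a placeholder name (the PDS doesn't implement repo import today), and the hostnames are made up. It prints the commands by default rather than running them:

```shell
#!/bin/sh
# Sketch: migrate one account's repo from an old PDS to a fresh one.
# Hypothetical pieces: car-v2-to-v3, the importRepo endpoint, both hosts.
OLD_PDS="${OLD_PDS:-https://old-pds.example.com}"
NEW_PDS="${NEW_PDS:-https://new-pds.example.com}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually execute the commands

run() {
  # In dry-run mode, just show what would be executed.
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

migrate_did() {
  did="$1"
  # 1. Export the account's repo as a CAR file from the old PDS.
  run curl -sf "$OLD_PDS/xrpc/com.atproto.sync.getRepo?did=$did" -o "$did.v2.car"
  # 2. Convert the CAR from repo v2 to v3 (hypothetical conversion CLI).
  run car-v2-to-v3 -in "$did.v2.car" -out "$did.v3.car"
  # 3. Re-import into the fresh PDS (placeholder endpoint, not implemented yet).
  run curl -sf -X POST "$NEW_PDS/xrpc/com.atproto.repo.importRepo" \
    --data-binary "@$did.v3.car"
}

migrate_did "did:plc:example123"
```

Blobs would need a similar per-account export/upload loop, and this says nothing about re-registering identities on the new instance; it only illustrates the CAR-level part of the migration.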
Hello bsky team,
We're currently running an Atproto PDS which uses `repo-v2` and is up to date with commit hash 913c912bc400b8f4f2976d1ddfd91ca271870408. We're looking to migrate our databases to `repo-v3` and to run an appview, indexer, and ingester in addition to the PDS. We wanted to ask what the best suggested path forward is for both the data migration and the schema migration. I see that there are PRs, such as #1479, which assisted in upgrading repos to the new version, but it looks like the entire process may be spread across a few 'migration' PRs, and some have since been removed, so we wanted to see what the bsky team suggests to complete this migration.
Thanks!
cc'ing team members @viksit and @tj-dispatch