Some notes from trying to understand how Syncthing works by implementing parts of it in Rust.
- [X] Set up protocol negotiation for bep/1.0
- [X] Send a `Hello` packet back
- [X] Verify peer’s certificate
- [X] Refactor/clean up the code as it’s getting messy… in progress…
- [X] Keep the connection open and handle incoming messages
- [X] Try switching to `prost` as the protobuf implementation
- [X] Properly handle connection shutdown
- [ ] Discover our IP address(es) instead of hardcoding them
- [X] Start putting stuff in a config file (eg. peers that we know about)
- [ ] Start implementing the Model
- [ ] Global discovery
- [ ] File indexing, db, file synchronization…
- [ ] QUIC support?
- [ ] IPv6 support?
- [ ] REST api?
A key pair is generated the first time the program runs: a private key + a public key (stored as an X.509 certificate). The key pair is used to do a TLS handshake with other peers.
The keys are stored in PEM format (basically an ASCII-armored way to wrap binary data): base64 data between delimiters:
```
-----BEGIN FOO-----
<data goes here>
-----END FOO-----
```
The binary data encoded in the PEM file is the certificate encoded with ASN.1 and serialized with DER.
ASN.1 defines a way to encode data in data structures i.e. defines a grammar for a set of types, with their fields.
The general structure of a certificate, encoded as ASN.1, is as follows:
```
Certificate
    Version Number
    Serial Number
    Signature Algorithm ID
    Issuer Name
    Validity period
        Not Before
        Not After
    Subject name
    Subject Public Key Info
        Public Key Algorithm
        Subject Public Key
    Issuer Unique Identifier (optional)
    Subject Unique Identifier (optional)
    Extensions (optional)
        ...
Certificate Signature Algorithm
Certificate Signature
```
`DER` is a type-length-value way to serialize data structures such as ASN.1 into bytes. For example, the ASN.1 INTEGER 5 is encoded in DER as the three bytes `02 01 05`: tag `0x02` (INTEGER), length 1, value 5.
- SHA-256 hash of the certificate (the DER-encoded binary data)
- base-32 encoded
- in recent versions the device ID contains 4 checksum digits (1 every 13 characters), so the length of the ID is 56 instead of 52. See Luhn’s algorithm.
- ’-’ is inserted every 7 characters for readability purposes (the full derivation is sketched below)
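Roughly what that derivation looks like in Rust, assuming the `sha2` and `data-encoding` crates; the Luhn mod-32 variant below is my reading of the scheme, so it should be double-checked against Syncthing’s implementation:

```rust
use sha2::{Digest, Sha256};

// Base32 alphabet (RFC 4648, upper case, no padding).
const ALPHABET: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

// Luhn mod-32 check character over one 13-character group.
fn luhn_check_char(group: &str) -> char {
    let n = ALPHABET.len() as u32;
    let mut factor = 1;
    let mut sum = 0;
    for c in group.bytes() {
        let codepoint = ALPHABET.iter().position(|&a| a == c).unwrap() as u32;
        let mut addend = factor * codepoint;
        factor = if factor == 2 { 1 } else { 2 };
        addend = (addend / n) + (addend % n);
        sum += addend;
    }
    let check = (n - sum % n) % n;
    ALPHABET[check as usize] as char
}

// DER-encoded certificate -> human-readable device ID.
fn device_id(der_cert: &[u8]) -> String {
    // 1. SHA-256 of the DER bytes, base32-encoded: 52 characters once the
    //    padding is stripped.
    let hash = Sha256::digest(der_cert);
    let b32 = data_encoding::BASE32.encode(hash.as_slice());
    let b32 = b32.trim_end_matches('=');
    // 2. One check character per 13-character group: 56 characters.
    let with_checks: String = b32
        .as_bytes()
        .chunks(13)
        .map(|c| {
            let group = std::str::from_utf8(c).unwrap();
            format!("{}{}", group, luhn_check_char(group))
        })
        .collect();
    // 3. Insert '-' every 7 characters for readability.
    with_checks
        .as_bytes()
        .chunks(7)
        .map(|c| std::str::from_utf8(c).unwrap())
        .collect::<Vec<_>>()
        .join("-")
}
```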
- Listens for UDP messages broadcast on port 21027. Registers the announced devices in a cache (see the listener sketch after this list).
- The cache is used to resolve device IDs
- every 30-60s, broadcast an `Announce` packet on the network
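A minimal sketch of the listening side with Tokio; decoding the `Announce` message with prost is left as a comment:

```rust
use tokio::net::UdpSocket;

// Local discovery packets: a 4-byte magic number, then a protobuf Announce.
const DISCOVERY_MAGIC: u32 = 0x2EA7D90B;

async fn listen_local_discovery() -> std::io::Result<()> {
    let socket = UdpSocket::bind("0.0.0.0:21027").await?;
    let mut buf = [0u8; 1024];
    loop {
        let (len, addr) = socket.recv_from(&mut buf).await?;
        if len >= 4 && buf[..4] == DISCOVERY_MAGIC.to_be_bytes() {
            // Decode with prost and register in the cache, e.g.:
            // let announce = Announce::decode(&buf[4..len])?;
            println!("announcement from {addr}: {} payload bytes", len - 4);
        }
    }
}
```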
- Try to resolve the device ID to get addresses to connect to, by looking through different “providers”, e.g. config file, local discovery, global discovery, etc.
- Connect to the address with TLS while presenting our own certificate
- Exchange `Hello` packets (see the framing sketch after this list)
- Verify that the remote certificate (there should be only one) matches the device ID we expect.
- Send a `ClusterConfig` message: this contains the list of folders to synchronize with the peer. The list is taken from the config file.
- For each folder sent in the `ClusterConfig` message, send an `Index` message with the list of files that we have. The peer will do the same.
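A sketch of the `Hello` framing (4-byte magic + 2-byte length + protobuf body, per the BEP spec); the `Hello` struct here is a hand-written stand-in for the prost-generated type:

```rust
use prost::Message;
use tokio::io::{AsyncWrite, AsyncWriteExt};

// Pre-authentication magic number from the BEP spec.
const HELLO_MAGIC: u32 = 0x2EA7D90B;

// Stand-in for the Hello message generated from the BEP .proto by prost-build.
#[derive(Clone, PartialEq, prost::Message)]
struct Hello {
    #[prost(string, tag = "1")]
    device_name: String,
    #[prost(string, tag = "2")]
    client_name: String,
    #[prost(string, tag = "3")]
    client_version: String,
}

// Frame and send a Hello: magic, big-endian length prefix, protobuf body.
async fn send_hello<W: AsyncWrite + Unpin>(w: &mut W, hello: &Hello) -> std::io::Result<()> {
    let body = hello.encode_to_vec();
    w.write_u32(HELLO_MAGIC).await?;
    w.write_u16(body.len() as u16).await?;
    w.write_all(&body).await?;
    w.flush().await
}
```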
- a `service` launches a bunch of goroutines:
  - one to listen for incoming connections
    - once the connection is established, the TLS handshake done, and the identity verified, a new `connection` is created and registered
  - a dialer that loops and tries to connect to configured peers
- when a `connection` is registered and started, it launches a bunch of goroutines:
  - one that reads messages and sends them to the `dispatcher`
  - a `dispatcher` that handles the different messages (e.g. `ClusterConfig`, `Index`, `Close`, etc.)
  - one that writes messages to the socket
  - a ping sender
  - a ping receiver
- Use Tokio for the async stuff…
- Use tokio-tls (and in particular `tokio-rustls`) for the TLS stuff: it wraps `rustls` in an async API and integrates with tokio
- [X] Use `tracing` for logging
- Can `tokio-tower` be useful here?
When connecting 2 peers, they each offer their certificate/key as part of the
TLS handshake, but it does not do client authentication (as the certificates
are self-signed). Unfortunately, rustls doesn’t seem to support this scenario
out of the box…

- `NoClientAuth` doesn’t do client authentication, but doesn’t request the peer certificate either
- `AllowAnyAuthenticatedClient` and `AllowAnyAnonymousOrAuthenticatedClient` do request the peer certificate, but try to validate it (and fail)
The workaround was to implement a custom `ClientCertVerifier` that requests the
peer certificate but doesn’t do authentication (or rather, accepts
everything…). It required enabling the `dangerous_configuration` feature on the
`rustls` crate to access the `ClientCertVerifier` trait.
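Roughly what that verifier looks like, written here against a rustls 0.20-era API (the trait’s location and method signatures have moved around between rustls versions, so treat this as a sketch):

```rust
use std::time::SystemTime;

use rustls::server::{ClientCertVerified, ClientCertVerifier};
use rustls::{Certificate, DistinguishedNames, Error};

struct AcceptAnyClientCert;

impl ClientCertVerifier for AcceptAnyClientCert {
    // Request a certificate from the peer during the handshake...
    fn offer_client_auth(&self) -> bool {
        true
    }

    fn client_auth_root_subjects(&self) -> Option<DistinguishedNames> {
        Some(Vec::new())
    }

    // ...but accept whatever is presented: the certificate is checked against
    // the expected device ID separately, after the handshake.
    fn verify_client_cert(
        &self,
        _end_entity: &Certificate,
        _intermediates: &[Certificate],
        _now: SystemTime,
    ) -> Result<ClientCertVerified, Error> {
        Ok(ClientCertVerified::assertion())
    }
}
```

The verifier then gets plugged into the `ServerConfig` builder via `with_client_cert_verifier`.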
Not quite sure yet how to handle the lifecycle of all the pieces… When we get
the ctrl-c and exit the `select` block, we drop the different services,
which should close the different sockets properly. We might need something more
sophisticated to close all the open connections.
Ended up using a `watch<bool>` channel from Tokio. Each async task gets a
`Receiver`, which it `await`s on. When it’s time to shut down, the main thread
broadcasts `true` on the channel, which causes all tasks to finish up and drop
their `Receiver`. Meanwhile the main thread `await`s on the `Sender` for all
the `Receiver`s to be dropped.
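A minimal sketch of that scheme; the task bodies here are stand-ins (a real task would `select!` between its actual work and the shutdown signal):

```rust
use tokio::sync::watch;

#[tokio::main]
async fn main() {
    let (shutdown_tx, shutdown_rx) = watch::channel(false);

    for i in 0..3 {
        let mut rx = shutdown_rx.clone();
        tokio::spawn(async move {
            // Wait for the shutdown signal.
            while rx.changed().await.is_ok() {
                if *rx.borrow() {
                    println!("task {i} shutting down");
                    break;
                }
            }
            // The task's Receiver is dropped here.
        });
    }
    // Drop our own copy so only the tasks hold Receivers.
    drop(shutdown_rx);

    tokio::signal::ctrl_c().await.expect("failed to listen for ctrl-c");
    shutdown_tx.send(true).expect("no receivers left");
    // Resolves once every Receiver has been dropped.
    shutdown_tx.closed().await;
}
```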
In the `go` version, when a connection is “started”, a bunch of goroutines are started:
- one to listen for incoming TCP messages, deserialize them, and send them to an `inbox` channel
- one to listen for outgoing messages on an `outbox` channel and send them on the wire to the peer
- one to listen for incoming messages on the `inbox` channel and dispatch them to other services
- one to listen to `ping` messages and timeout the connection if we haven’t received one for more than x seconds
- one that sends pings if we haven’t sent a message for more than x seconds
- one that listens on the `close` channel and closes the connection if needed
This blog post https://ryhl.io/blog/actors-with-tokio/ should be very useful to structure the code.
I tried applying the above principles and ended up with:
- a `ConnectionReader` task: listens on the `Reader` half of the socket, deserializes incoming messages, and sends them on the inbox channel;
- a `ConnectionDispatcher` task: pulls messages from the inbox, reacts to them, and optionally sends another message to the outbox;
- a `ConnectionWriter` task: pulls messages from the outbox, serializes them, and writes them to the `Writer` half of the socket;
- a `ConnectionPingReceiver` task: times out the connection if no message has been received in the last x seconds;
- a `ConnectionPingSender` task: sends a `Ping` message if no message has been sent for more than x seconds.
All these tasks are managed by a single `ConnectionHandle`.
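A skeleton of that layout, following the actor pattern from the blog post; `Message` stands in for the BEP message enum, the socket I/O is left as comments, and the ping tasks are elided:

```rust
use tokio::sync::mpsc;

// Stand-in for the BEP message enum.
#[derive(Debug)]
enum Message {
    Ping,
    Close,
    // ClusterConfig, Index, ...
}

// Owns the outbox channel; the tasks themselves are detached.
struct ConnectionHandle {
    outbox: mpsc::Sender<Message>,
}

fn spawn_connection() -> ConnectionHandle {
    let (inbox_tx, mut inbox_rx) = mpsc::channel::<Message>(32);
    let (outbox_tx, mut outbox_rx) = mpsc::channel::<Message>(32);

    // ConnectionReader: would deserialize messages from the socket's Reader
    // half and push them onto the inbox.
    tokio::spawn(async move {
        // loop { inbox_tx.send(read_message(&mut reader).await?).await?; }
        let _ = inbox_tx;
    });

    // ConnectionDispatcher: pulls from the inbox, reacts, optionally replies.
    let dispatcher_outbox = outbox_tx.clone();
    tokio::spawn(async move {
        while let Some(msg) = inbox_rx.recv().await {
            match msg {
                Message::Ping => {
                    let _ = dispatcher_outbox.send(Message::Ping).await;
                }
                Message::Close => break,
            }
        }
    });

    // ConnectionWriter: would serialize outbox messages onto the Writer half.
    tokio::spawn(async move {
        while let Some(_msg) = outbox_rx.recv().await {
            // write_message(&mut writer, msg).await?;
        }
    });

    ConnectionHandle { outbox: outbox_tx }
}
```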
One thing that caught me by surprise: protobuf has a concept of default values, and it omits fields set to their default value when sending a message over the wire.
In the case of Syncthing messages, the first message sent after authentication
is a `ClusterConfig` message. Its message type is `0`, which is the default
value, and if no compression is used, the compression type is also the default
value. Similarly, if the cluster configuration doesn’t have any folders, the
folder list is at its default value (empty). So both the header and the actual
message are effectively empty. All that is left are the header length and the
message length, which are both 0, i.e. the entire packet consists of 6 null
bytes: `00 00 00 00 00 00`. This confused me for a while when I was debugging
things…
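A quick illustration with prost, using a hand-written `Header` that mirrors the shape of the BEP header (not the generated code):

```rust
// Fields at their default values are simply omitted from the wire format.
#[derive(Clone, PartialEq, prost::Message)]
struct Header {
    #[prost(int32, tag = "1")]
    r#type: i32, // CLUSTER_CONFIG = 0, the default
    #[prost(int32, tag = "2")]
    compression: i32, // NONE = 0, the default
}

fn main() {
    use prost::Message;
    let header = Header { r#type: 0, compression: 0 };
    // Nothing is written: the length prefixes are all a peer will see.
    assert_eq!(header.encode_to_vec().len(), 0);
}
```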
Stumbled upon the `prost` crate, which seems to be another implementation of protobufs in Rust. It seems simpler (but relies on macros). It also seems to play better with the `bytes` crate.
- [X] Try out (and potentially switch to) `prost`