feat: add zerotier extension to Talos #596

Open
wants to merge 4 commits into
base: main
Changes from 3 commits
2 changes: 1 addition & 1 deletion Makefile
@@ -103,6 +103,7 @@ TARGETS += v4l-uvc-drivers
TARGETS += vmtoolsd-guest-agent
TARGETS += wasmedge
TARGETS += xen-guest-agent
TARGETS += zerotier
TARGETS += zfs
NONFREE_TARGETS = nonfree-kmod-nvidia-lts
NONFREE_TARGETS += nonfree-kmod-nvidia-production
@@ -249,4 +250,3 @@ release-notes: $(ARTIFACTS)
conformance:
@docker pull $(CONFORMANCE_IMAGE)
@docker run --rm -it -v $(PWD):/src -w /src $(CONFORMANCE_IMAGE) enforce

2 changes: 2 additions & 0 deletions network/vars.yaml
@@ -6,3 +6,5 @@ LLDPD_VERSION: 1.0.19
CLOUDFLARED_VERSION: 2024.12.1
# renovate: datasource=github-releases extractVersion=^v(?<version>.*)$ depName=slackhq/nebula
NEBULA_VERSION: 1.9.5
# renovate: datasource=github-releases depName=zerotier/ZeroTierOne
ZEROTIER_VERSION: 1.14.2
60 changes: 60 additions & 0 deletions network/zerotier/README.md
@@ -0,0 +1,60 @@
# ZeroTier

Adds [ZeroTier](https://zerotier.com) network interfaces as a system extension.
This lets you reach your Talos nodes from any machine you have configured
with ZeroTier, over a secure overlay network.

## Installation

See [Installing Extensions](https://github.com/siderolabs/extensions#installing-extensions).

## Usage

Configure the extension via an `ExtensionServiceConfig` document.

```yaml
---
apiVersion: v1alpha1
kind: ExtensionServiceConfig
name: zerotier
environment:
- ZEROTIER_NETWORK=<your network id>
```

Then apply the patch to your node's MachineConfig:

```bash
talosctl patch mc -p @zerotier-config.yaml
```

You can then verify that it is in place with the following command:

```bash
talosctl get extensionserviceconfigs

NODE     NAMESPACE   TYPE                     ID         VERSION
mynode   runtime     ExtensionServiceConfig   zerotier   1
```

## Configuration

The extension can be configured through environment variables:

- `ZEROTIER_NETWORK`: The network ID to join (required)
- `ZEROTIER_IDENTITY_SECRET`: Optional pre-existing identity to use (format: "address:0:public:private")
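
A wrapper could sanity-check both variables before starting the daemon. The following is a minimal sketch, not code from this PR: `checkEnv` is a hypothetical helper, it assumes ZeroTier network IDs are 16 hexadecimal characters, and it only counts the four colon-separated identity fields rather than validating them cryptographically.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// networkIDPattern matches a 16-character hexadecimal network ID
// (assumed format for ZEROTIER_NETWORK).
var networkIDPattern = regexp.MustCompile(`^[0-9a-fA-F]{16}$`)

// checkEnv is a hypothetical helper that checks the shape of the two
// environment values described above. It does not verify the identity keys.
func checkEnv(network, identity string) error {
	if network == "" {
		return errors.New("ZEROTIER_NETWORK is required")
	}
	if !networkIDPattern.MatchString(network) {
		return fmt.Errorf("ZEROTIER_NETWORK %q is not a 16-character hex ID", network)
	}
	if identity == "" {
		return nil // the identity is optional
	}
	if parts := strings.Split(identity, ":"); len(parts) != 4 {
		return fmt.Errorf("ZEROTIER_IDENTITY_SECRET has %d fields, want 4", len(parts))
	}
	return nil
}

func main() {
	// A well-formed network ID with no identity passes the check.
	fmt.Println(checkEnv("0123456789abcdef", ""))
	// A malformed identity is rejected.
	fmt.Println(checkEnv("0123456789abcdef", "only:three:parts"))
}
```

The actual identity check in this PR is delegated to `zerotier-one -i validate`; a shape check like this would only give an earlier, clearer error message.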

### Using an existing identity

If you want to maintain the same ZeroTier identity across rebuilds or different nodes, you can specify an existing identity:

```yaml
---
apiVersion: v1alpha1
kind: ExtensionServiceConfig
name: zerotier
environment:
- ZEROTIER_NETWORK=<your network id>
- ZEROTIER_IDENTITY_SECRET=<identity string>
```

If no identity is provided, a new one is generated automatically. Depending on your network policies, you may need to authorize the node in your ZeroTier network before it receives an IP address.
10 changes: 10 additions & 0 deletions network/zerotier/manifest.yaml
@@ -0,0 +1,10 @@
version: v1alpha1
metadata:
  name: zerotier
  version: "$VERSION"
  author: Hive Technologies
  description: |
    Connect your Talos cluster into a zerotier network
  compatibility:
    talos:
      version: ">= v1.8.0"
53 changes: 53 additions & 0 deletions network/zerotier/pkg.yaml
@@ -0,0 +1,53 @@
name: zerotier
variant: alpine
Member

Any reason alpine is used? It should build fine with our toolchain.

I've updated it to use the toolchain. It didn't build for me with the toolchain originally, hence alpine, but it builds just fine now, so that's great.

shell: /bin/bash
dependencies:
  - stage: base
    from: /
    to: /base-rootfs
  - stage: zerotier-wrapper
install:
  - build-base
  - linux-headers
  - libstdc++
steps:
  -
    sources:
      - url: https://github.com/zerotier/ZeroTierOne/archive/refs/tags/{{ .ZEROTIER_VERSION }}.tar.gz
        destination: zerotier.tar.gz
        sha256: c2f64339fccf5148a7af089b896678d655fbfccac52ddce7714314a59d7bddbb
        sha512: 9d022afcf81543d6ee938219a3712da846fe895b0fd65cfd6ec8ed173f0e208516031b6d2303ab42fd21806d9ba5ff6fdb0d850a0cbb32b268d53accb093cdf5
    env:
      CXXFLAGS: '-Os -fstack-protector -std=c++17 -pthread'
      LDFLAGS: '-static'
    prepare:
      - |
        sed -i 's#$VERSION#{{ .VERSION }}#' /pkg/manifest.yaml
      - |
        tar -xzvf zerotier.tar.gz --strip-components=1
    build:
      - |
        sed -i '2i #include <cmath>' ext/prometheus-cpp-lite-1.0/core/include/prometheus/text_serializer.h
        make ONE_THREAD=1 ZT_SSO_SUPPORTED=0 STATIC=1 -j $(nproc)
    install:
      - |
        mkdir -p /rootfs/usr/local/lib/containers/zerotier/usr/local/bin/
        cp -pr zerotier-one /rootfs/usr/local/lib/containers/zerotier/usr/local/bin/
        cp -pr /rootfs/usr/local/bin/zerotier-wrapper /rootfs/usr/local/lib/containers/zerotier/usr/local/bin/
        chmod +x /rootfs/usr/local/lib/containers/zerotier/usr/local/bin/zerotier-*
      - |
        mkdir -p /rootfs/usr/local/etc/containers/zerotier/usr/local/etc/zerotier/state
        cp /pkg/zerotier.yaml /rootfs/usr/local/etc/containers/
    test:
      - |
        mkdir -p /extensions-validator-rootfs
        cp -r /rootfs/ /extensions-validator-rootfs/rootfs
        cp /pkg/manifest.yaml /extensions-validator-rootfs/manifest.yaml
        /base-rootfs/extensions-validator validate --rootfs=/extensions-validator-rootfs --pkg-name="${PKG_NAME}"
      - |
        [[ $(/rootfs/usr/local/lib/containers/zerotier/usr/local/bin/zerotier-one -v) == *{{ .ZEROTIER_VERSION }}* ]]
finalize:
  - from: /rootfs
    to: /rootfs
  - from: /pkg/manifest.yaml
    to: /
1 change: 1 addition & 0 deletions network/zerotier/vars.yaml
@@ -0,0 +1 @@
VERSION: "{{ .ZEROTIER_VERSION }}"
5 changes: 5 additions & 0 deletions network/zerotier/zerotier-wrapper/go.mod
@@ -0,0 +1,5 @@
module zerotier-wrapper

go 1.23.0

require golang.org/x/sys v0.30.0
2 changes: 2 additions & 0 deletions network/zerotier/zerotier-wrapper/go.sum
@@ -0,0 +1,2 @@
golang.org/x/sys v0.30.0 h1:QjkSwP/36a20jFYWkSue1YwXzLmsV5Gfq7Eiy72C1uc=
golang.org/x/sys v0.30.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
232 changes: 232 additions & 0 deletions network/zerotier/zerotier-wrapper/main.go
@@ -0,0 +1,232 @@
package main
Member

so this basically makes generating the pub key easier?

Member

Just trying to see if needing the wrapper makes sense if the user could provide the secret and pub key as extension service config

@rob-htl rob-htl Feb 23, 2025

In short yes. There has to be a unique identity per node.

The usage guide from ZeroTier boils down to:

  • Install from the ZeroTier repos
  • (the Debian package generates the identity post-install; presumably the others do too, as it's not handled by the following commands)
  • Use zerotier-cli join <network_id> (i.e. zerotier-one -q join <network_id>)
  • Authorize the node at the ZeroTier end (the most common way to do this is via the web interface, but there are other ways to do it programmatically)
  • the node is connected

The alternative approach (without the wrapper) is for an admin to have ZeroTier installed on their workstation and, each time a new node is added to a Talos cluster, generate the identities one by one (zerotier-idtool / zerotier-one -i) and create a MachineConfig per node.

Without identity files in place on the Talos node there will be no error; the zerotier service will start and display only this output (which could cause user confusion).

Starting Control Plane...
Starting V6 Control Plane...

This wrapper allows the admin to have one config patch per cluster for the zerotier network id (with a feature to allow the identity to be provided via env per node and be validated), and it provides more robust logging output.


import (
	"bytes"
	"errors"
	"fmt"
	"log"
	"os"
	"os/exec"
	"os/signal"
	"strconv"
	"strings"
	"time"

	"golang.org/x/sys/unix"
)

const (
	zerotierPath    = "/var/lib/zerotier-one"
	identityPath    = "/var/lib/zerotier-one/identity.secret"
	identityPubPath = "/var/lib/zerotier-one/identity.public"
	pidFile         = "/var/lib/zerotier-one/zerotier-one.pid"
	zerotierBinPath = "/usr/local/bin/zerotier-one"
)

func main() {
	log.Printf("zerotier-wrapper: initializing...")

	// Ensure the ZeroTier state directory exists.
	if err := os.MkdirAll(zerotierPath, 0755); err != nil {
		log.Fatalf("failed to create state directory: %v", err)
	}

	// Ensure identity configuration.
	identitySource, err := ensureIdentity()
	if err != nil {
		log.Fatalf("identity configuration failed: %v", err)
	}
	log.Printf("identity configured (source: %s)", identitySource)

	// Cleanup any existing zerotier-one process.
	if err := cleanupProcess(); err != nil {
		log.Fatalf("process cleanup failed: %v", err)
	}

	// If ZEROTIER_NETWORK env var is set, join network using zerotier-one -q.
	if network := os.Getenv("ZEROTIER_NETWORK"); network != "" {
		log.Printf("will join network %s after startup", network)
		go func() {
			time.Sleep(2 * time.Second)
			if err := joinNetwork(network); err != nil {
				log.Printf("failed to join network: %v", err)
			} else {
				log.Printf("joined network %s", network)
			}
		}()
	}

	// Start zerotier-one process.
	cmd := exec.Command(zerotierBinPath, "-U", zerotierPath)
Member

This makes things more complicated than needed; it can be replaced with unix.Exec, which makes all the PID handling unnecessary.
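
A rough sketch of that suggestion, under stated assumptions: it uses `syscall.Exec` from the standard library (`unix.Exec` in `golang.org/x/sys/unix` takes the same arguments), the helper name `daemonArgv` is hypothetical, and the binary path and `-U` flag are copied from the diff. Because exec replaces the wrapper's process image, the daemon receives signals directly, so no PID file or signal forwarding is involved.

```go
package main

import (
	"log"
	"os"
	"syscall"
)

const (
	zerotierBinPath = "/usr/local/bin/zerotier-one"
	zerotierPath    = "/var/lib/zerotier-one"
)

// daemonArgv is a hypothetical helper that builds the argument vector.
// By convention argv[0] is the program name itself.
func daemonArgv(bin, stateDir string) []string {
	return []string{bin, "-U", stateDir}
}

func main() {
	// Identity and network setup would run here, before the exec.

	// So that this sketch runs on a machine without ZeroTier, it returns
	// quietly when the binary is missing. A real wrapper would exit non-zero.
	if _, err := os.Stat(zerotierBinPath); err != nil {
		log.Printf("zerotier-one not found, nothing to exec: %v", err)
		return
	}

	// Exec only returns if it fails; on success this process becomes zerotier-one.
	argv := daemonArgv(zerotierBinPath, zerotierPath)
	if err := syscall.Exec(zerotierBinPath, argv, os.Environ()); err != nil {
		log.Fatalf("exec %s: %v", zerotierBinPath, err)
	}
}
```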

@rob-htl rob-htl Feb 23, 2025

exec.Command was due to using zerotier-one -q join <network_id> requiring the service to be up. However all "joining" the network entails is creating an empty config file with the network name, so I have done that to join instead and removed exec.Command and all PID handling.

Member

we can also split this into two services, one that runs once only, and it doesn't run if file already exists

I don't quite follow, are you suggesting something like the following?

  • zerotier-setup: create identity, join network etc etc
  • zerotier-wrapper: checks that setup has been done and unix.Exec zerotier-one

e.g.

container:
  entrypoint: /bin/sh
  args:
    - -c
    - |
      /usr/local/bin/zerotier-setup && /usr/local/bin/zerotier-wrapper

we can also split this into two services, one that runs once only, and it doesn't run if file already exists

@frezbo are you able to clarify what you want here please? I'm keen to get it done and resolved

Member

sorry, I'll try to get back to this by this week or next, don't worry, this will go in 1.10, just finishing other tasks

	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	if err := cmd.Start(); err != nil {
		log.Fatalf("error starting zerotier-one: %v", err)
	}

	// Write the PID file.
	pidStr := strconv.Itoa(cmd.Process.Pid)
	if err := os.WriteFile(pidFile, []byte(pidStr), 0644); err != nil {
		log.Printf("failed to write PID file: %v", err)
	}

	// Forward termination signals to the zerotier-one process.
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, unix.SIGINT, unix.SIGTERM)
	sig := <-ch
	log.Printf("received signal %v, forwarding to zerotier-one process", sig)
	if err := cmd.Process.Signal(sig); err != nil {
		log.Fatalf("error sending signal to zerotier-one: %v", err)
	}

	if err := cmd.Wait(); err != nil {
		log.Fatalf("zerotier-one exited with error: %v", err)
	}
}

// ensureIdentity checks for an existing identity file, validates it if found,
// or else uses the identity from the ZEROTIER_IDENTITY_SECRET environment variable (after validation)
// or generates a new one using "zerotier-one -i generate".
func ensureIdentity() (string, error) {
	// If the identity file exists, validate its contents.
	if _, err := os.Stat(identityPath); err == nil {
		data, err := os.ReadFile(identityPath)
		if err != nil {
			return "", fmt.Errorf("failed to read existing identity: %w", err)
		}
		identity := strings.TrimSpace(string(data))
		log.Printf("found existing identity at %s, validating...", identityPath)
		if err := validateIdentity(identity); err != nil {
			return "", fmt.Errorf("existing identity failed validation: %w", err)
		}
		log.Printf("existing identity validated")
		return "existing", nil
	} else if !errors.Is(err, os.ErrNotExist) {
		return "", fmt.Errorf("failed to stat identity file: %w", err)
	}

	// Check for identity in environment.
	if identity := os.Getenv("ZEROTIER_IDENTITY_SECRET"); identity != "" {
		log.Printf("found identity in ZEROTIER_IDENTITY_SECRET environment variable, validating...")
		if err := validateIdentity(identity); err != nil {
			return "", fmt.Errorf("environment identity invalid: %w", err)
		}
		log.Printf("environment identity validated")
		if err := writeIdentity(identity); err != nil {
			return "", fmt.Errorf("failed to write identity from environment: %w", err)
		}
		return "environment", nil
	}

	// Generate a new identity using "zerotier-one -i generate".
	log.Printf("generating new identity using zerotier-one -i generate")
	cmd := exec.Command(zerotierBinPath, "-i", "generate")
	var out bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("failed to generate identity: %w", err)
	}
	identity := strings.TrimSpace(out.String())
	if err := validateIdentity(identity); err != nil {
		return "", fmt.Errorf("generated identity failed validation: %w", err)
	}
	if err := writeIdentity(identity); err != nil {
		return "", fmt.Errorf("failed to write generated identity: %w", err)
	}
	return "generated", nil
}

// validateIdentity runs "zerotier-one -i validate <identity>" to ensure the identity is valid.
func validateIdentity(identity string) error {
	cmd := exec.Command(zerotierBinPath, "-i", "validate", identity)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		return fmt.Errorf("identity validation failed: %w", err)
	}
	return nil
}

// writeIdentity writes the complete identity string (all four parts) to identity.secret,
// while writing only the first three parts (separated by ':') to identity.public.
func writeIdentity(identity string) error {
	parts := strings.Split(identity, ":")
	if len(parts) != 4 {
		return fmt.Errorf("invalid identity format: expected 4 parts, got %d", len(parts))
	}

	// Write the secret identity file with the full identity.
	if err := os.WriteFile(identityPath, []byte(identity), 0600); err != nil {
		return fmt.Errorf("failed to write secret identity: %w", err)
	}
	log.Printf("wrote secret identity to %s", identityPath)

	// Write the public identity file with only the first 3 parts.
	public := strings.Join(parts[:3], ":")
	if err := os.WriteFile(identityPubPath, []byte(public), 0644); err != nil {
		return fmt.Errorf("failed to write public identity: %w", err)
	}
	log.Printf("wrote public identity to %s", identityPubPath)

	return nil
}

// cleanupProcess checks for an existing PID file; if found, it kills the process and removes the file.
func cleanupProcess() error {
	if _, err := os.Stat(pidFile); err == nil {
		pid, err := getProcessId()
		if err != nil {
			return fmt.Errorf("error reading pid file: %w", err)
		}
		if err := killProcess(pid); err != nil {
			return fmt.Errorf("error killing process: %w", err)
		}
		if err := os.Remove(pidFile); err != nil {
			return fmt.Errorf("error removing pid file: %w", err)
		}
		log.Printf("cleaned up existing process (PID %d)", pid)
	} else if !errors.Is(err, os.ErrNotExist) {
		return fmt.Errorf("failed to stat pid file: %w", err)
	} else {
		log.Printf("no PID file found, no existing process to clean up")
	}
	return nil
}

func getProcessId() (int, error) {
	pidData, err := os.ReadFile(pidFile)
	if err != nil {
		return 0, err
	}
	pidData = bytes.TrimRight(pidData, "\n")
	pid, err := strconv.Atoi(string(pidData))
	if err != nil {
		return 0, err
	}
	return pid, nil
}

func killProcess(pid int) error {
	p, err := os.FindProcess(pid)
	if err != nil {
		return err
	}
	if err := p.Kill(); err != nil && !errors.Is(err, os.ErrProcessDone) {
		return err
	}
	return nil
}

// joinNetwork uses "zerotier-one -q join <network>" to join the specified network.
func joinNetwork(network string) error {
	log.Printf("attempting to join network %s", network)
	cmd := exec.Command(zerotierBinPath, "-q", "join", network)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		return fmt.Errorf("join network failed: %w", err)
	}
	return nil
}