Initial implementation. #1

Merged 16 commits on Apr 24, 2024
14 changes: 14 additions & 0 deletions .github/dependabot.yaml
@@ -0,0 +1,14 @@
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    open-pull-requests-limit: 5
    schedule:
      interval: 'weekly'
      day: 'tuesday'
  - package-ecosystem: 'gomod'
    directory: '/'
    schedule:
      interval: 'weekly'
      day: 'tuesday'
    open-pull-requests-limit: 5
42 changes: 42 additions & 0 deletions .github/workflows/build.yaml
@@ -0,0 +1,42 @@
name: Go

on: [push]
permissions:
  contents: read
  pull-requests: read
  checks: write
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Go
        uses: actions/setup-go@v4
        with:
          go-version-file: go.mod

      - name: Verify
        run: go mod verify

      - name: Build
        run: go build ./...

      - name: golangci-lint
        uses: golangci/golangci-lint-action@v4
        with:
          version: latest
          install-mode: binary

      - name: Prepare manifests for linting
        run: |
          mkdir manifests
          go run deploy/main.go my-images v0.0.8 vanilla > manifests/vanilla.yaml
          go run deploy/main.go my-images v0.0.8 ocp > manifests/ocp.yaml
          go run deploy/main.go my-images v0.0.8 vanilla my-secret > manifests/vanilla-with-secret.yaml
          go run deploy/main.go my-images v0.0.8 ocp my-secret > manifests/ocp-with-secret.yaml

      - name: kube-linter
        uses: stackrox/[email protected]
        with:
          directory: manifests
8 changes: 8 additions & 0 deletions Dockerfile
@@ -0,0 +1,8 @@
FROM golang:1.22 AS builder
LABEL authors="[email protected]"
COPY . /image-prefetcher
RUN cd /image-prefetcher && CGO_ENABLED=0 go build -a -ldflags '-extldflags "-static"' . && find . -ls

FROM scratch
COPY --from=builder /image-prefetcher/image-prefetcher /
CMD ["/image-prefetcher"]
83 changes: 83 additions & 0 deletions README.md
@@ -0,0 +1,83 @@
# Image prefetcher

This is a utility for quickly fetching OCI images onto Kubernetes cluster nodes.

It talks directly to the Container Runtime Interface ([CRI](https://kubernetes.io/docs/concepts/architecture/cri/)) API to:
- fetch all images on all nodes in parallel,
- retry pulls with increasingly longer timeouts, which prevents getting stuck on stalled connections to the image registry.

## Architecture

### `image-prefetcher`

- main binary,
- meant to be run in pods of a DaemonSet,
- shipped as an OCI image,
- provides two subcommands:
  - `fetch`: runs the actual image pulls via CRI, meant to run as an init container.
    Requires access to the CRI UNIX domain socket from the host.
  - `sleep`: just sleeps forever, meant to run as the main container.

### `deploy`

- a helper command-line utility for generating `image-prefetcher` manifests,
- a separate Go module, with no dependencies outside the Go standard library.

## Usage

1. First, run the `deploy` binary to generate a manifest for an instance of `image-prefetcher`.

   You can run many instances independently.

   It requires a few arguments:
   - **name** of the instance.
     This also determines the name of a `ConfigMap` supplying names of images to fetch.
   - `image-prefetcher` OCI image **version**. See [list of existing tags](https://quay.io/repository/mowsiany/image-prefetcher?tab=tags).
   - **cluster flavor**. Currently one of:
     - `vanilla`: a generic Kubernetes distribution without additional restrictions.
     - `ocp`: OpenShift, which requires explicitly granting special privileges.
   - optional **image pull `Secret` name**. Required if the images are not pullable anonymously.
     This image pull secret should be usable for all images fetched by the given instance.
     If provided, it must be of type `kubernetes.io/dockerconfigjson` and exist in the same namespace.

Example:

```
go run github.com/stackrox/image-prefetcher/deploy@main my-images v0.0.8 vanilla > manifest.yaml
```

2. Prepare an image list. This should be a plain text file with one image name per line.
Lines starting with `#` and blank ones are ignored.
```
echo debian:latest >> image-list.txt
echo quay.io/strimzi/kafka:latest-kafka-3.7.0 >> image-list.txt
```

3. Deploy:
```
kubectl create namespace prefetch-images
kubectl create -n prefetch-images configmap my-images --from-file="images.txt=image-list.txt"
kubectl apply -n prefetch-images -f manifest.yaml
```

4. Wait for the pull to complete, with a timeout:
```
kubectl rollout -n prefetch-images status daemonset my-images --timeout 5m
```

5. If something goes wrong, look at logs:
```
kubectl logs -n prefetch-images daemonset/my-images -c prefetch
```
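
The image list format from step 2 (one image per line, with blank lines and lines starting with `#` ignored) can be illustrated with a small standalone sketch. The function name `parseImageList` is hypothetical and chosen for this example; the parser that ships in this change is `parseImageNames` in `cmd/fetch.go`.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseImageList returns the image names found in text, skipping blank
// lines and lines that start with '#'. Surrounding whitespace is trimmed.
func parseImageList(text string) []string {
	var images []string
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		images = append(images, line)
	}
	return images
}

func main() {
	list := "# base images\ndebian:latest\n\nquay.io/strimzi/kafka:latest-kafka-3.7.0\n"
	for _, image := range parseImageList(list) {
		fmt.Println(image)
	}
}
```

Running it prints the two image names and drops the comment and the blank line, which is a quick way to check what a given `image-list.txt` will actually request.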

### Customization

You can tweak certain parameters such as timeouts by editing `args` in the above manifest.
See the [fetch command](./cmd/fetch.go) for accepted flags.

## Limitations

This utility was designed for small, ephemeral test clusters, in order to improve reliability and speed of end-to-end tests.

If deployed on larger clusters, it may have a "thundering herd" effect on the OCI registries it pulls from.
This is because all images are pulled from all nodes in parallel.
94 changes: 94 additions & 0 deletions cmd/fetch.go
@@ -0,0 +1,94 @@
package cmd

import (
	"log/slog"
	"os"
	"strings"
	"time"

	"github.com/stackrox/image-prefetcher/internal"

	"github.com/spf13/cobra"
)

// fetchCmd represents the fetch command
var fetchCmd = &cobra.Command{
	Use:   "fetch",
	Short: "Fetch images using CRI.",
	Long: `This subcommand is intended to run in an init container of pods of a DaemonSet.

It talks to Container Runtime Interface API to pull images in parallel, with retries.`,
	RunE: func(cmd *cobra.Command, args []string) error {
		opts := &slog.HandlerOptions{AddSource: true}
		if debug {
			opts.Level = slog.LevelDebug
		}
		logger := slog.New(slog.NewTextHandler(os.Stderr, opts))
		timing := internal.TimingConfig{
			ImageListTimeout:          imageListTimeout,
			InitialPullAttemptTimeout: initialPullAttemptTimeout,
			MaxPullAttemptTimeout:     maxPullAttemptTimeout,
			OverallTimeout:            overallTimeout,
			InitialPullAttemptDelay:   initialPullAttemptDelay,
			MaxPullAttemptDelay:       maxPullAttemptDelay,
		}
		imageList, err := loadImageNamesFromFile(imageListFile)
		if err != nil {
			return err
		}
		imageList = append(imageList, args...)
		return internal.Run(logger, criSocket, dockerConfigJSONPath, timing, imageList...)
	},
}

var (
	criSocket                 string
	dockerConfigJSONPath      string
	imageListFile             string
	debug                     bool
	imageListTimeout          = time.Minute
	initialPullAttemptTimeout = 30 * time.Second
	maxPullAttemptTimeout     = 5 * time.Minute
	overallTimeout            = 20 * time.Minute
	initialPullAttemptDelay   = time.Second
	maxPullAttemptDelay       = 10 * time.Minute
)

func init() {
	rootCmd.AddCommand(fetchCmd)

	fetchCmd.Flags().StringVar(&criSocket, "cri-socket", "/run/containerd/containerd.sock", "Path to CRI UNIX socket.")
	fetchCmd.Flags().StringVar(&dockerConfigJSONPath, "docker-config", "", "Path to docker config json file.")
	fetchCmd.Flags().StringVar(&imageListFile, "image-list-file", "", "Path to text file containing images to pull (one per line).")
	fetchCmd.Flags().BoolVar(&debug, "debug", false, "Whether to enable debug logging.")

	fetchCmd.Flags().DurationVar(&imageListTimeout, "image-list-timeout", imageListTimeout, "Timeout for image list calls (for debugging).")
	fetchCmd.Flags().DurationVar(&initialPullAttemptTimeout, "initial-pull-attempt-timeout", initialPullAttemptTimeout, "Timeout for initial image pull call. Each subsequent attempt doubles it until max.")
	fetchCmd.Flags().DurationVar(&maxPullAttemptTimeout, "max-pull-attempt-timeout", maxPullAttemptTimeout, "Maximum timeout for image pull call.")
	fetchCmd.Flags().DurationVar(&overallTimeout, "overall-timeout", overallTimeout, "Overall timeout for a single run.")
	fetchCmd.Flags().DurationVar(&initialPullAttemptDelay, "initial-pull-attempt-delay", initialPullAttemptDelay, "Initial delay between pulls of the same image. Each subsequent attempt doubles it until max.")
	fetchCmd.Flags().DurationVar(&maxPullAttemptDelay, "max-pull-attempt-delay", maxPullAttemptDelay, "Maximum delay between pulls of the same image.")
}

// loadImageNamesFromFile reads image names from fileName.
// An empty fileName means no file was given and yields no images.
func loadImageNamesFromFile(fileName string) ([]string, error) {
	if fileName == "" {
		return nil, nil
	}
	bytes, err := os.ReadFile(fileName)
	if err != nil {
		return nil, err
	}
	return parseImageNames(bytes), nil
}

// parseImageNames returns one image name per non-empty line,
// skipping blank lines and lines starting with "#".
func parseImageNames(bytes []byte) []string {
	var imageNames []string
	for _, line := range strings.Split(string(bytes), "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		imageNames = append(imageNames, line)
	}
	return imageNames
}
23 changes: 23 additions & 0 deletions cmd/root.go
@@ -0,0 +1,23 @@
package cmd

import (
	"fmt"
	"github.com/spf13/cobra"
	"log"
)

// rootCmd represents the base command when called without any subcommands
var rootCmd = &cobra.Command{
	Use:   "image-prefetcher",
	Short: "An image prefetching utility.",
	Run: func(cmd *cobra.Command, args []string) {
		fmt.Println("Please use one of the subcommands. See --help")
	},
}

// Execute is the entry point to this program.
func Execute() {
	if err := rootCmd.Execute(); err != nil {
		log.Fatal(err)
	}
}
26 changes: 26 additions & 0 deletions cmd/sleep.go
@@ -0,0 +1,26 @@
package cmd

import (
	"github.com/spf13/cobra"
	"os"
	"os/signal"
	"syscall"
)

// sleepCmd represents the sleep command
var sleepCmd = &cobra.Command{
	Use:   "sleep",
	Short: "Sleep forever.",
	Long:  `This can be used as main container of a DaemonSet to avoid having to pull another image.`,
	Run: func(cmd *cobra.Command, args []string) {
		println("Sleeping...")
		// Block until the container is asked to stop.
		cancelChan := make(chan os.Signal, 1)
		signal.Notify(cancelChan, syscall.SIGTERM, syscall.SIGINT)
		s := <-cancelChan
		println("Terminating due to", s)
	},
}

func init() {
	rootCmd.AddCommand(sleepCmd)
}