
[wicketd] Store TUF artifacts on disk instead of in memory (#3953)
Prior to this PR, `wicketd` could end up with quite a large heap because
we were keeping several things in memory:

1. When a TUF repo was uploaded from `wicket`, we streamed it into
memory
2. After extracting the TUF repo into a temporary directory, we
neglected to free it until _after_ we'd ingested all its contents
(pushing our high-water mark `$TUF_REPO_SIZE` higher than it needed to
be)
3. We then read every extracted artifact from the TUF repo back into
memory
4. We then further extracted the contents of all RoT (tiny) and host OS
(not at all tiny) artifacts to get at their inner artifacts (A/B images
for the RoT; phase1 / phase2 blobs for OS images)

We then kept all the in-memory artifacts from steps 3 and 4 around
indefinitely. If a new TUF repo was uploaded, we would repeat steps 1-4
above, and only free the step 3 and 4 artifacts from the first repo
_after_ successfully finishing all four steps on the second repo.

This leads to some large heaps:

1. After uploading one TUF repo:

```
root@oxz_switch:/opt/oxide# pgrep wicketd | xargs pmap -S | grep heap
0000000002979000    3390440    3390440 rw---    [ heap ]
00000000D1873000    2392080    2392080 rw---    [ heap ]
```

2. After uploading a second TUF repo:

```
root@oxz_switch:/opt/oxide# pgrep wicketd | xargs pmap -S | grep heap
0000000002979000    3390440    3390440 rw---    [ heap ]
00000000D1873000    3424296    3424296 rw---    [ heap ]
00000001A287D000    2392080    2392080 rw---    [ heap ]
```

Actually performing updates does not grow these sizes.

---

This change makes some significant cuts on the above by shifting several
things out of memory and into files on disk (currently all stored in
tempdirs under `/tmp`, so still RAM, but we could move them to a
physical disk if desired):

1. When receiving a TUF repo from wicket, we now stream it directly to a
temporary file instead of into memory, and we extract from that temp
file to a temp directory.
2. Instead of reading all the TUF contents into memory, we now copy them
into a temporary directory (where "temporary" is perhaps misleading: we
keep this directory around until a new TUF repo is uploaded); we then
read them from this directory on demand, in a streaming fashion rather
than entirely into memory, while updates are performed
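The streaming in item 1 can be sketched roughly as follows. This is a minimal sketch using only the standard library; the real wicketd code is async and uses its own types, so `stream_to_file` and `body` here are hypothetical stand-ins for the upload path:

```rust
use std::fs::File;
use std::io::{self, BufWriter, Read, Write};
use std::path::Path;

/// Hypothetical sketch: stream an uploaded repo body to a file on disk
/// instead of collecting it into a `Vec<u8>`. `body` stands in for the
/// chunked upload stream wicketd receives from wicket.
fn stream_to_file(mut body: impl Read, dest: &Path) -> io::Result<u64> {
    let file = File::create(dest)?;
    let mut writer = BufWriter::new(file);
    // io::copy moves data in fixed-size chunks, so peak memory stays
    // bounded regardless of how large the repo is.
    let bytes = io::copy(&mut body, &mut writer)?;
    writer.flush()?;
    Ok(bytes)
}
```

The key property is that memory usage is proportional to the copy buffer, not to the repo size.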

Heap usage as of this PR after uploading one TUF repo:

```
BRM44220001 # pgrep wicketd | xargs pmap -S | grep heap
00000000029D7000      36216      36216 rw---    [ heap ]
```

After uploading a second TUF repo:

```
BRM44220001 # pgrep wicketd | xargs pmap -S | grep heap
00000000029D7000      48708      48708 rw---    [ heap ]
```

Neither uploading additional repos nor running updates grows this size.

This change has some tradeoffs:

1. Ingesting an uploaded TUF repo is slower, but tolerably so (~10
seconds on main, ~13 seconds on this PR)
2. We now have more potential I/O errors, including possibly failing to
open files that we expect to be in our temporary directory when they're
needed during an update
3. We're using a nontrivial amount of space under `/tmp` (high water
mark is during the ingest of a second TUF repo, where we would have all
the extracted artifacts of both repos in separate tempdirs just before
we remove one of them)

but these are probably worth it in light of RFD 413 / #3943.
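Tradeoff 2 can at least be made diagnosable by attaching the offending path to any I/O error. A minimal, hypothetical sketch (`open_artifact` is not the actual wicketd API):

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical sketch of tradeoff 2: artifacts now live on disk, so an
/// update can fail at read time if an expected file is missing. Naming
/// the path in the error message makes such failures easy to diagnose.
fn open_artifact(dir: &Path, name: &str) -> io::Result<File> {
    let path = dir.join(name);
    File::open(&path).map_err(|e| {
        io::Error::new(
            e.kind(),
            format!("failed to open artifact {}: {}", path.display(), e),
        )
    })
}
```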

Expanding slightly on item 3 (`/tmp` usage), in case it ends up being
valuable:

1. When `wicketd` first starts, it is not using `/tmp`.
2. As a repo is uploaded, it is streamed into a file under `/tmp` (space
used: roughly 1.2 GiB currently)
3. Once the upload is complete, the repo is unzipped into a temporary
directory under `/tmp` (space used: the size of the repo .zip file plus
the size of the unpacked repo; roughly 1.2 GiB + 1.2 GiB = 2.4 GiB
currently)
4. The temp file from step 2 is deleted (space used: back down to the
size of the unpacked repo; roughly 1.2 GiB)
5. wicketd creates a new temporary directory with a somewhat meaningful
name (`wicketd-update-artifacts.ZZZZZZ`), then iterates through the
artifacts in the repo, copying some and further extracting others; e.g.,
host and trampoline artifacts are expanded into their phase1 and phase2
components (space used: the size of the unpacked repo plus the size of
the unpacked artifacts; roughly 1.2 GiB + 1.4 GiB = 2.6 GiB currently)
6. The unpacked repo temp directory from step 3 is deleted (space used:
back down to the size of the unpacked artifacts: roughly 1.4 GiB
currently)
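The replace-on-upload ordering implied by steps 5 and 6 can be sketched as follows (a hypothetical `swap_artifact_dirs` helper, not the actual implementation): the new artifact directory is fully populated before the old one is removed, which is what produces the space high-water mark.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical sketch of the replace-on-upload ordering: the new
/// artifact directory is fully populated before the previous one is
/// removed, so for a moment both directories coexist on disk.
fn swap_artifact_dirs(
    old_dir: Option<PathBuf>,
    new_dir: PathBuf,
    populate: impl FnOnce(&Path) -> io::Result<()>,
) -> io::Result<PathBuf> {
    fs::create_dir_all(&new_dir)?;
    // High-water point: old artifacts + new artifacts both present.
    populate(new_dir.as_path())?;
    // Only after the new directory is complete do we drop the old one.
    if let Some(old) = old_dir {
        fs::remove_dir_all(old)?;
    }
    Ok(new_dir)
}
```

This ordering trades peak disk usage for safety: a failure while populating the new directory leaves the old artifacts intact.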

If a _second_ repo is then uploaded, we go through all of the above
steps again, but we still have the `wicketd-update-artifacts.ZZZZZZ`
directory from the first repo, which is not removed until we reach the
end of step 6. This makes our high-water mark step 5 of a second (or
later) repo upload, where we're using a total of (current values in
parens):

```
  size of extracted artifacts from first repo  (1.4 GiB)
+ size of expanded second repo                 (1.2 GiB)
+ size of extracted artifacts from second repo (1.4 GiB)
= maximal space used under /tmp                (4.0 GiB)
```
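A trivial sanity check of that arithmetic (sizes in GiB as quoted above; purely illustrative):

```rust
/// Sums the /tmp contributions quoted above (sizes in GiB) to
/// reproduce the stated high-water mark.
fn tmp_high_water_gib() -> f64 {
    let first_repo_artifacts = 1.4;
    let second_repo_expanded = 1.2;
    let second_repo_artifacts = 1.4;
    first_repo_artifacts + second_repo_expanded + second_repo_artifacts
}
```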

---

Fixes #3943.
jgallagher authored Sep 12, 2023
1 parent 77fb3be commit 951651a
Showing 23 changed files with 2,402 additions and 1,437 deletions.
2 changes: 1 addition & 1 deletion Cargo.lock


39 changes: 39 additions & 0 deletions common/src/update.rs
@@ -61,6 +61,11 @@ impl Artifact {
kind: self.kind.clone(),
}
}

/// Returns the artifact ID for this artifact without clones.
pub fn into_id(self) -> ArtifactId {
ArtifactId { name: self.name, version: self.version, kind: self.kind }
}
}

/// An identifier for an artifact.
@@ -165,6 +170,40 @@ impl ArtifactKind {
/// These artifact kinds are not stored anywhere, but are derived from stored
/// kinds and used as internal identifiers.
impl ArtifactKind {
/// Gimlet root of trust A slot image identifier.
///
/// Derived from [`KnownArtifactKind::GimletRot`].
pub const GIMLET_ROT_IMAGE_A: Self =
Self::from_static("gimlet_rot_image_a");

/// Gimlet root of trust B slot image identifier.
///
/// Derived from [`KnownArtifactKind::GimletRot`].
pub const GIMLET_ROT_IMAGE_B: Self =
Self::from_static("gimlet_rot_image_b");

/// PSC root of trust A slot image identifier.
///
/// Derived from [`KnownArtifactKind::PscRot`].
pub const PSC_ROT_IMAGE_A: Self = Self::from_static("psc_rot_image_a");

/// PSC root of trust B slot image identifier.
///
/// Derived from [`KnownArtifactKind::PscRot`].
pub const PSC_ROT_IMAGE_B: Self = Self::from_static("psc_rot_image_b");

/// Switch root of trust A slot image identifier.
///
/// Derived from [`KnownArtifactKind::SwitchRot`].
pub const SWITCH_ROT_IMAGE_A: Self =
Self::from_static("switch_rot_image_a");

/// Switch root of trust B slot image identifier.
///
/// Derived from [`KnownArtifactKind::SwitchRot`].
pub const SWITCH_ROT_IMAGE_B: Self =
Self::from_static("switch_rot_image_b");

/// Host phase 1 identifier.
///
/// Derived from [`KnownArtifactKind::Host`].
6 changes: 4 additions & 2 deletions gateway/src/lib.rs
@@ -100,10 +100,12 @@ fn start_dropshot_server(
let http_server_starter = dropshot::HttpServerStarter::new(
&dropshot,
http_entrypoints::api(),
Arc::clone(&apictx),
Arc::clone(apictx),
&log.new(o!("component" => "dropshot")),
)
.map_err(|error| format!("initializing http server: {}", error))?;
.map_err(|error| {
format!("initializing http server listening at {addr}: {}", error)
})?;

match http_servers.entry(addr) {
Entry::Vacant(slot) => {
24 changes: 1 addition & 23 deletions installinator-artifactd/src/http_entrypoints.rs
@@ -11,7 +11,7 @@ use dropshot::{
};
use hyper::{header, Body, StatusCode};
use installinator_common::EventReport;
use omicron_common::update::{ArtifactHashId, ArtifactId};
use omicron_common::update::ArtifactHashId;
use schemars::JsonSchema;
use serde::Deserialize;
use uuid::Uuid;
@@ -25,7 +25,6 @@ pub fn api() -> ArtifactServerApiDesc {
fn register_endpoints(
api: &mut ArtifactServerApiDesc,
) -> Result<(), String> {
api.register(get_artifact_by_id)?;
api.register(get_artifact_by_hash)?;
api.register(report_progress)?;
Ok(())
@@ -38,27 +37,6 @@ pub fn api() -> ArtifactServerApiDesc {
api
}

/// Fetch an artifact from this server.
#[endpoint {
method = GET,
path = "/artifacts/by-id/{kind}/{name}/{version}"
}]
async fn get_artifact_by_id(
rqctx: RequestContext<ServerContext>,
// NOTE: this is an `ArtifactId` and not an `UpdateArtifactId`, because this
// code might be dealing with an unknown artifact kind. This can happen
// if a new artifact kind is introduced across version changes.
path: Path<ArtifactId>,
) -> Result<HttpResponseHeaders<HttpResponseOk<FreeformBody>>, HttpError> {
match rqctx.context().artifact_store.get_artifact(&path.into_inner()).await
{
Some((size, body)) => Ok(body_to_artifact_response(size, body)),
None => {
Err(HttpError::for_not_found(None, "Artifact not found".into()))
}
}
}

/// Fetch an artifact by hash.
#[endpoint {
method = GET,
13 changes: 1 addition & 12 deletions installinator-artifactd/src/store.rs
@@ -10,16 +10,13 @@ use async_trait::async_trait;
use dropshot::HttpError;
use hyper::Body;
use installinator_common::EventReport;
use omicron_common::update::{ArtifactHashId, ArtifactId};
use omicron_common::update::ArtifactHashId;
use slog::Logger;
use uuid::Uuid;

/// Represents a way to fetch artifacts.
#[async_trait]
pub trait ArtifactGetter: fmt::Debug + Send + Sync + 'static {
/// Gets an artifact, returning it as a [`Body`] along with its length.
async fn get(&self, id: &ArtifactId) -> Option<(u64, Body)>;

/// Gets an artifact by hash, returning it as a [`Body`].
async fn get_by_hash(&self, id: &ArtifactHashId) -> Option<(u64, Body)>;

@@ -63,14 +60,6 @@ impl ArtifactStore {
Self { log, getter: Box::new(getter) }
}

pub(crate) async fn get_artifact(
&self,
id: &ArtifactId,
) -> Option<(u64, Body)> {
slog::debug!(self.log, "Artifact requested: {:?}", id);
self.getter.get(id).await
}

pub(crate) async fn get_artifact_by_hash(
&self,
id: &ArtifactHashId,
6 changes: 4 additions & 2 deletions installinator/src/write.rs
@@ -953,7 +953,9 @@ mod tests {
Event, InstallinatorCompletionMetadata, InstallinatorComponent,
InstallinatorStepId, StepEventKind, StepOutcome,
};
use omicron_common::api::internal::nexus::KnownArtifactKind;
use omicron_common::{
api::internal::nexus::KnownArtifactKind, update::ArtifactKind,
};
use omicron_test_utils::dev::test_setup_log;
use partial_io::{
proptest_types::{
@@ -1072,7 +1074,7 @@ mod tests {
data2.into_iter().map(Bytes::from).collect();

let host_id = ArtifactHashId {
kind: KnownArtifactKind::Host.into(),
kind: ArtifactKind::HOST_PHASE_2,
hash: {
// The `validate_written_host_phase_2_hash()` will fail unless
// we give the actual hash of the host phase 2 data, so compute
55 changes: 0 additions & 55 deletions openapi/installinator-artifactd.json
@@ -53,57 +53,6 @@
}
}
},
"/artifacts/by-id/{kind}/{name}/{version}": {
"get": {
"summary": "Fetch an artifact from this server.",
"operationId": "get_artifact_by_id",
"parameters": [
{
"in": "path",
"name": "kind",
"description": "The kind of artifact this is.",
"required": true,
"schema": {
"type": "string"
}
},
{
"in": "path",
"name": "name",
"description": "The artifact's name.",
"required": true,
"schema": {
"type": "string"
}
},
{
"in": "path",
"name": "version",
"description": "The artifact's version.",
"required": true,
"schema": {
"$ref": "#/components/schemas/SemverVersion"
}
}
],
"responses": {
"200": {
"description": "",
"content": {
"*/*": {
"schema": {}
}
}
},
"4XX": {
"$ref": "#/components/responses/Error"
},
"5XX": {
"$ref": "#/components/responses/Error"
}
}
}
},
"/report-progress/{update_id}": {
"post": {
"summary": "Report progress and completion to the server.",
@@ -2373,10 +2322,6 @@
"slots_attempted",
"slots_written"
]
},
"SemverVersion": {
"type": "string",
"pattern": "^(0|[1-9]\\d*)\\.(0|[1-9]\\d*)\\.(0|[1-9]\\d*)(?:-((?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\\.(?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\\+([0-9a-zA-Z-]+(?:\\.[0-9a-zA-Z-]+)*))?$"
}
}
}
40 changes: 39 additions & 1 deletion openapi/wicketd.json
@@ -724,6 +724,25 @@
"message"
]
},
"ArtifactHashId": {
"description": "A hash-based identifier for an artifact.\n\nSome places, e.g. the installinator, request artifacts by hash rather than by name and version. This type indicates that.",
"type": "object",
"properties": {
"hash": {
"description": "The hash of the artifact.",
"type": "string",
"format": "hex string (32 bytes)"
},
"kind": {
"description": "The kind of artifact this is.",
"type": "string"
}
},
"required": [
"hash",
"kind"
]
},
"ArtifactId": {
"description": "An identifier for an artifact.\n\nThe kind is [`ArtifactKind`], indicating that it might represent an artifact whose kind is unknown.",
"type": "object",
@@ -1154,9 +1173,10 @@
"type": "object",
"properties": {
"artifacts": {
"description": "Map of artifacts we ingested from the most-recently-uploaded TUF repository to a list of artifacts we're serving over the bootstrap network. In some cases the list of artifacts being served will have length 1 (when we're serving the artifact directly); in other cases the artifact in the TUF repo contains multiple nested artifacts inside it (e.g., RoT artifacts contain both A and B images), and we serve the list of extracted artifacts but not the original combination.\n\nConceptually, this is a `BTreeMap<ArtifactId, Vec<ArtifactHashId>>`, but JSON requires string keys for maps, so we give back a vec of pairs instead.",
"type": "array",
"items": {
"$ref": "#/components/schemas/ArtifactId"
"$ref": "#/components/schemas/InstallableArtifacts"
}
},
"event_reports": {
@@ -1301,6 +1321,24 @@
}
}
},
"InstallableArtifacts": {
"type": "object",
"properties": {
"artifact_id": {
"$ref": "#/components/schemas/ArtifactId"
},
"installable": {
"type": "array",
"items": {
"$ref": "#/components/schemas/ArtifactHashId"
}
}
},
"required": [
"artifact_id",
"installable"
]
},
"IpRange": {
"oneOf": [
{