
Add a boot device field to Instance, forward along as appropriate #6585

Merged
merged 44 commits into main on Oct 1, 2024

Conversation

iximeow
Member

@iximeow iximeow commented Sep 16, 2024

the Omicron side of adding explicit boot order selection to instances (counterpart to propolis#756).

first, this extends params::InstanceCreate to take a new boot_disk: Option<params::InstanceDiskAttachment>.

additionally, this adds a PUT /v1/instances/{instance} endpoint to update instances. the only property that can be updated at the moment is boot_disk: picking a new boot disk or unsetting it entirely. this also partially subsumes #6321.

finally, this updates Omicron to reference a recent enough Propolis that #756 is included.

a surprising discovery along the way: you can specify a disk to be attached multiple times in disks today, when creating an instance, and we're fine with it! this carries through with the new boot_disk field: if you specify the disk as boot_disk and in disks, it is technically listing the disk for attachment twice but this will succeed.

@@ -63,6 +63,10 @@ pub struct Instance {
#[diesel(column_name = auto_restart_policy)]
pub auto_restart_policy: Option<InstanceAutoRestart>,

/// The primary boot device for this instance.
#[diesel(column_name = boot_device)]
pub boot_device: Option<String>,
Member Author

if this were just the field it is, it should be Name, since it is .. the name of some other boot device. but on the Propolis side we'll actually accept something more flexible.

@ahl @askfongjojo @karencfv and @david-crespo i'd like your thoughts on how much we should carry that through: recording a BootOrder struct which we can then extend in the future is certainly nice for internal purposes, but i'm less sure how to judge the complexity/extensibility tradeoff for the more user-facing API here.

if we have just "boot device" here it seems straightforward in all uses today, but allowing a future boot order list, or "first boot only"-style settings on boot entries, means we'd need a new field and presumably to allow only one of {boot_device, future_more_complex_list}.

i'm inclined towards keeping just the single optional boot_device string to avoid the extra layers in user-facing APIs, but i don't know if there's other context i should be aware of.

Contributor

Due to names being mutable, I'd prefer the ID of the boot device. For users who rely on automation to manage their infrastructure, an immutable identifier is always preferable. Like how we have project_id as part of this struct :)

Contributor

i'm inclined towards keeping just the single optional boot_device string to avoid the extra layers in user-facing APIs, but i don't know if there's other context i should be aware of.

I agree with this, it seems more future proof as well

Contributor

@david-crespo david-crespo Sep 17, 2024

Yeah, ID would be much more typical. Often we allow something to be set by NameOrId, but we resolve it to an ID immediately and store it as an ID.

Member Author

oh! i wasn't sure that they were mutable. in that case, yes, the ID would be much better. it turns out the name is what gets sent out to Propolis, so i'd just stuck with that here, but perhaps Propolis should be getting IDs as well.

Member

I'd definitely advocate for changing the Propolis API to take a UUID, if at all possible.

Member Author

when we're creating the instance, we require that InstanceDiskAttach specify a name, and not an ID. we probably ought to take NameOrId there. otherwise allowing both here seems to be of limited utility.. the use cases afaict look like:

  • if you're creating new disks along with the instance creation, you don't have IDs yet, so you couldn't provide an ID to boot_disk. simple enough, strong indicator we should at least take Name here.
  • if you're attaching existing disks, you must use their Name, so passing a disk ID here seems like net-more work than just reusing the name you'd have put in disks[] already.
  • if you're updating the boot_disk of a stopped instance, that seems like the most likely case for an ID on its own to be especially helpful.

i don't want to go change InstanceDiskAttach here too (basically just because i don't know if that is laden with secret cans of worms), but if folks agree it seems like it would be nice to allow NameOrId sooner than later for disk attachment.
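a rough sketch of what taking NameOrId and resolving to an ID immediately (as suggested above) could look like — names and types here are illustrative stand-ins, not the real omicron-common definitions:

```rust
use std::collections::HashMap;

// Hypothetical sketch of a NameOrId-style selector for a boot disk.
// The real type would use a Uuid for the Id variant; a String stands
// in here to keep the sketch dependency-free.
#[derive(Debug, Clone, PartialEq)]
enum NameOrId {
    Name(String),
    Id(String),
}

// Resolve a selector to an immutable disk ID right away, so only the
// ID is ever stored, as discussed above.
fn resolve_disk_id(
    sel: &NameOrId,
    disks_by_name: &HashMap<String, String>,
) -> Option<String> {
    match sel {
        NameOrId::Id(id) => Some(id.clone()),
        NameOrId::Name(name) => disks_by_name.get(name).cloned(),
    }
}

fn main() {
    let mut disks_by_name = HashMap::new();
    disks_by_name.insert("mydisk".to_string(), "disk-id-0".to_string());

    // A name resolves through the lookup table; an ID passes through as-is.
    assert_eq!(
        resolve_disk_id(&NameOrId::Name("mydisk".into()), &disks_by_name),
        Some("disk-id-0".to_string())
    );
    assert_eq!(
        resolve_disk_id(&NameOrId::Id("disk-id-0".into()), &disks_by_name),
        Some("disk-id-0".to_string())
    );
    println!("resolved");
}
```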

Contributor

Yeah, we did intend to change that, but I never got around to wrapping it up after Justin left:

#3671

Contributor

but if folks agree it seems like it would be nice to allow NameOrId sooner than later for disk attachment.

Definitely on board with this!

@iximeow
Member Author

iximeow commented Sep 17, 2024

one more note: i do expect to land this with a mechanism to update the boot order on a stopped instance. after reading #6321 this morning it looks like there's consensus around this kind of reconfiguration being a PUT to instance. so i'll expect to add a PUT /v1/instances route here and we can extend it to allow reconfiguring instance vCPUs/memory later.
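for illustration, such an update might look like the below — the exact body shape here is a sketch pending the NameOrId discussion, not final:

```
PUT /v1/instances/{instance}

{
    "boot_disk": {
        "name": "mydisk"
    }
}
```

and sending `"boot_disk": null` would unset the boot disk entirely.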

@gjcolombo gjcolombo self-requested a review September 19, 2024 20:31
additionally, add a general-purpose instance update endpoint to change
the boot device for an existing instance, as well as tests that the
constraints around what we can do with boot devices and the boot device
field are checked.
if the disk is not in the same project, it will not be attached, so it
will not be eligible to be the boot device.
also no longer need authz_project_disk
might return to this later, but as-is i've gotten something wrong with
the query..
(and fix a few test references to now-boot_device_id)
"you gotta run cargo xtask openapi generate"
@iximeow
Member Author

iximeow commented Sep 30, 2024

@ahl @david-crespo @karencfv 8b22d56 is the change to have this PR accept an InstanceDiskAttachment rather than just the name of a boot disk.

from an API perspective, the difference would be something like:

current PR:

POST /v1/instances

{
    "boot_disk": {
        "name": "mydisk"
    },
    "disks": [
        { "type": "attach", "name": "mydisk" }
    ],
    ....
}

vs with 8b22d56:

POST /v1/instances

{
    "boot_disk": {
        "type": "attach",
        "name": "mydisk"
    },
    "disks": [],
    ....
}

where since boot_disk is an InstanceDiskAttachment, it's also valid to provide a disk creation request (with an appropriate disk source)

so the biggest wrinkle with 8b22d56 is that we end up with disks: [] in the most likely case of instance creation. i'd kind of prefer that to be data_disks: [], but that would be a breaking change and i'm not sure how eagerly we do that.

@ahl mentions that most of the control plane requests come from other software we maintain (CLI, terraform, etc) so for the time being we can be pretty OK with breaking changes. from @karencfv it sounds that validating the Terraform provider at this point might be more work than we expected a breaking change to be.

either way i'm inclined to pick the commit but i'm less confident about changing disks. what do yall think?
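to make the 8b22d56 shape concrete, here's a sketch of the types it implies — the variant names and the disk-creation fields are hypothetical stand-ins, not the real params definitions:

```rust
// Hypothetical sketch of the InstanceDiskAttachment shape implied by
// the discussion: one enum covers both attaching an existing disk and
// creating a new one, so `boot_disk` can reuse it directly.
#[derive(Debug, PartialEq)]
enum InstanceDiskAttachment {
    // corresponds to { "type": "attach", "name": "mydisk" }
    Attach { name: String },
    // disk-creation params elided; `size_gib` is a placeholder field
    Create { name: String, size_gib: u64 },
}

// Sketch of the relevant slice of the create params.
struct InstanceCreate {
    boot_disk: Option<InstanceDiskAttachment>,
    disks: Vec<InstanceDiskAttachment>,
}

fn main() {
    // The common case after 8b22d56: one boot disk, no data disks.
    let params = InstanceCreate {
        boot_disk: Some(InstanceDiskAttachment::Attach { name: "mydisk".into() }),
        disks: Vec::new(),
    };
    assert!(matches!(
        &params.boot_disk,
        Some(InstanceDiskAttachment::Attach { name }) if name.as_str() == "mydisk"
    ));
    assert!(params.disks.is_empty());
    println!("boot disk set; data disks empty");
}
```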

@iximeow iximeow marked this pull request as ready for review September 30, 2024 16:39
@karencfv
Contributor

Thanks @iximeow for putting in all the effort to make this a much more intuitive API to work with!

I'm 100% on board with

POST /v1/instances

{
    "boot_disk": {
        "type": "attach",
        "name": "mydisk"
    },
    "disks": [],
    ....
}

It's much more intuitive this way. I'd assume that in this scenario the user would get an error if they set the same disk in boot_disk and disks?

Regarding:

so the biggest wrinkle with 8b22d56 is that we end up with disks: [] in the most likely case of instance creation. i'd kind of prefer that to be data_disks: [], but that would be a breaking change and i'm not sure how eagerly we do that.

If it's just changing the name of the field and none of the functionality, I think it wouldn't be as bad as I thought from the provider side. But there is the issue where we would break all of the terraform configuration files of the customers as they'd have to migrate their HCL config files to use data_disks instead of disks. Is this something we're OK with?

@karencfv
Contributor

perhaps we can think about it a bit more and hold the breaking change for a later release?

@david-crespo
Contributor

I'm on the fence. I like the data_disks change for making clear up front why there are two disk-related keys, so I agree we should do it either now or for v12.

I'm concerned about it being annoying to customers for any existing scripts to break as soon as the rack comes back up after the v11 upgrade. How do we minimize that pain? It feels weird to say this because we break APIs all the time, but I don't have a good feel for how that goes for customers. Do we just warn them hey, make sure you upgrade to the latest CLI and SDKs and Terraform (the release notes always say this at the top), and they do it and it's fine?

I tried to game out whether we gain something by waiting until next release to break the API, but in most scenarios I could think of, we end up incurring the same pain later as now, so we might as well do it now. For example: if most instance creates in the real world use one disk, and it's the boot disk, in theory people can move that disk attachment over to boot_disk (at their leisure, some time between v11 and v12), leaving disks empty, so when we then change that to data_disks in the next release, most people won't have to change anything. But in this situation (one disk), by stipulation, the change is not mandatory and doesn't change the behavior, so why would anyone do it until forced to in the next release when the API breaks?

@karencfv
Contributor

I'm concerned about it being annoying to customers for any existing scripts to break as soon as the rack comes back up after the v11 upgrade. How do we minimize that pain? It feels weird to say this because we break APIs all the time, but I don't have a good feel for how that goes for customers. Do we just warn them hey, make sure you upgrade to the latest CLI and SDKs and Terraform (the release notes always say this at the top), and they do it and it's fine?

Yeah, generally I list any breaking change as part of the changelog and release notes. I might have not phrased it correctly, but it wasn't so much saying that we shouldn't change it because it's a breaking change, but rather just point out that in this case it would be a breaking change the customers would see and have to take action.

I tried to game out whether we gain something by waiting until next release to break the API, but in most scenarios I could think of, we end up incurring the same pain later as now, so we might as well do it now. For example: if most instance creates in the real world use one disk, and it's the boot disk, in theory people can move that disk attachment over to boot_disk (at their leisure, some time between v11 and v12), leaving disks empty, so when we then change that to data_disks in the next release, most people won't have to change anything. But in this situation (one disk), by stipulation, the change is not mandatory and doesn't change the behavior, so why would anyone do it until forced to in the next release when the API breaks?

Sorry, didn't mean to imply the pain would be less, but rather that perhaps we may want to give ourselves time to think if this is a breaking change we think is worth making or not. Mostly because in this case the customers would feel it and the terraform instance resource is one of the most used ones, and the release date is so soon.

@iximeow
Member Author

iximeow commented Sep 30, 2024

leaving disks empty, so when we then change that to data_disks in the next release, most people won't have to change anything.

this makes me realize that since disks is non-optional, changing it to data_disks means even an empty disks would need to be changed to an empty data_disks. changing the field and making it optional means we might happily ignore an old request that tried to send disks, treating it as having data_disks: None.

mixed bag, there..

also, @karencfv, re.

I'd assume that in this scenario the user would get an error if they set the same disk in boot_disk and disk?

this was a behavior that surprised me as well - we actually accept duplicate entries in disks today, and since boot_disk is treated as just another disk for all disk creation/attachment purposes, if it duplicates an entry in disks that is also accepted. i'm not sure if this is something we especially desire, but i've added a test for it so that we know if we change it. the test also notes that noncommittal stance in case someone finds it in the future :)

@karencfv
Contributor

this makes me realize that since disks is non-optional, changing it to data_disks means even an empty disks would need to be changed to an empty data_disks. changing the field and making it optional means we might happily ignore an old request that tried to send disks, treating it as having data_disks: None.

mixed bag, there..

Hm, yeah, that wouldn't be great. Especially if a user forgot to upgrade their provider and is managing a large number of instances 🤔

this was a behavior that surprised me as well - we actually accept duplicate entries in disks today, and since boot_disk is treated as just another disk for all disk creation/attachment purposes, if it duplicates an entry in disks that is also accepted. i'm not sure if this is something we especially desire, but i've added a test for it so that we know if we change it. the test also notes that noncommittal stance in case someone finds it in the future :)

Ha! TIL. I guess if duplicates have been accepted all along then not a huge issue here 🤷‍♀️ . It'd probably be nice to have some sort of validation to make sure duplicates are not accepted but, that's probably for another time as it's not really up to this PR to fix.
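A sketch of what that duplicate-attachment validation could look like, if we ever add it — this is a hypothetical helper, not code from this PR:

```rust
use std::collections::HashSet;

// Hypothetical validation: reject a create request that names the same
// disk more than once across `boot_disk` and `disks`.
fn check_no_duplicate_disks(names: &[&str]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for name in names {
        // `insert` returns false when the name was already present.
        if !seen.insert(*name) {
            return Err(format!("disk {name:?} specified more than once"));
        }
    }
    Ok(())
}

fn main() {
    // Distinct names (boot_disk plus data disks) pass.
    assert!(check_no_duplicate_disks(&["mydisk", "data0"]).is_ok());
    // A disk listed as both boot_disk and in disks would be rejected.
    assert!(check_no_duplicate_disks(&["mydisk", "mydisk"]).is_err());
    println!("validation sketch ok");
}
```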

}
}

// if and when `Update` can update other fields, set them
Member

i believe this has been renamed?

Suggested change
// if and when `Update` can update other fields, set them
// if and when `Reconfigure` can update other fields, set them

Contributor

@gjcolombo gjcolombo left a comment

The internal mechanics of this all look great to me (and I've got a good sense for what I'd need to do to remake #6321 on the back of the update API you've added here). Thanks for putting all the time in on this one!

.await
.map_err(ActionError::action_failed)?;

// If there was a boot disk, clear it. If there was not a boot disk,
Contributor

Would it be too clever to try to look in the params to see if a boot disk was specified? Saves us from having to do the extra database calls to unset something we never bothered to set.

@jmpesp
Contributor

jmpesp commented Oct 1, 2024

you can specify a disk to be attached multiple times in disks today, when creating an instance, and we're fine with it!

I'm surprised we allow this, and additionally very surprised this works - I would have thought that the activations for multiply attached volumes would stomp all over each other (see https://rfd.shared.oxide.computer/rfd/0177#_activation for more on this). Was this tested with a real instance?

@iximeow
Member Author

iximeow commented Oct 1, 2024

Was this tested with a real instance?

not at the time, but at least through the control plane on my workstation it seems to work:

oxide.rs> jq . < body.json
{
  "name": "bootorder",
  "description": "multidisk test",
  "hostname": "bootorder",
  "memory": 1073741824,
  "ncpus": 2,
  "disks": [
    {
      "type": "attach",
      "name": "bootorder-4b94abf9-8bff-4b5f-9612-469-3ded17"
    },
    {
      "type": "attach",
      "name": "bootorder-4b94abf9-8bff-4b5f-9612-469-3ded17"
    }
  ]
}


oxide.rs> ./target/release/oxide --profile recovery2 instance create --project test --json-body body.json
{
  "auto_restart_enabled": true,
  "description": "multidisk test",
  "hostname": "bootorder",
  "id": "76081c8d-5391-41ec-889f-9a14ebd043d6",
  "memory": 1073741824,
  "name": "bootorder",
  "ncpus": 2,
  "project_id": "534bd3ae-7244-4324-8e6f-b6d8140b5079",
  "run_state": "starting",
  "time_created": "2024-10-01T02:28:17.156047Z",
  "time_modified": "2024-10-01T02:28:17.156047Z",
  "time_run_state_updated": "2024-10-01T02:28:20.481057Z"
}

oxide.rs> ./target/release/oxide --profile recovery2 instance list --project test
[
  {
    "auto_restart_enabled": true,
    "description": "multidisk test",
    "hostname": "bootorder",
    "id": "76081c8d-5391-41ec-889f-9a14ebd043d6",
    "memory": 1073741824,
    "name": "bootorder",
    "ncpus": 2,
    "project_id": "534bd3ae-7244-4324-8e6f-b6d8140b5079",
    "run_state": "running",
    "time_created": "2024-10-01T02:28:17.156047Z",
    "time_modified": "2024-10-01T02:28:17.156047Z",
    "time_run_state_updated": "2024-10-01T02:28:32.422132Z"
  }
]

i'm quite certain if the duplicate entries were to create new disks it would fail for reasons like you indicate (if nothing else, duplicate names should prevent rows from being created, let alone getting to any real Crucibles). but for attachment, it looks to me like we just end up thinking the already-successful attach is unremarkable?

Contributor

@karencfv karencfv left a comment

Thanks for the huge effort to get this feature rolled out! The user facing side of the API looks great to me!

@jmpesp
Contributor

jmpesp commented Oct 1, 2024

I'd like to test this before this gets merged if that's ok, I'll do it first thing tomorrow morning.

@jmpesp
Contributor

jmpesp commented Oct 1, 2024

for attachment, it looks to me like we just end up thinking the already-successful attach is unremarkable?

Yep, sending a POST body with multiple of the same attach statement doesn't make its way to propolis, I'd be willing to bet the code you highlighted is what's responsible.

So: surprising that Nexus accepts that POST body but we do the right thing. 🚢

@iximeow iximeow merged commit 144e91a into main Oct 1, 2024
19 checks passed
@iximeow iximeow deleted the ixi/boot-order branch October 1, 2024 17:13
iximeow added a commit that referenced this pull request Oct 1, 2024
Greg and Eliza were both right in comments on #6585, but since these are
both fully internal I didn't want to add another CI round trip there :)
hawkw added a commit that referenced this pull request Oct 1, 2024
This commit extends the `instance-reconfigure` API endpoint added in
#6585 to also allow setting instance auto-restart policies (as added in
#6503).

I've also added the actual auto-restart policy to the external API
instance view, along with the boolean `auto_restart_enabled` added in
#6503. This way, it's possible to change just the boot disk by providing
the current auto-restart policy in an instance POST.
iximeow added a commit that referenced this pull request Oct 2, 2024
in #6585 i'd unintentionally removed the fragment from `bhyve_api`'s
source entry in Cargo.lock. it's not _wrong_, since it's still
specifying the desired `11371b0...`, but without the fragment at the end
of the source Cargo parses this as a [non-precise
reference](https://github.com/rust-lang/cargo/blob/d9c14e664e994ea5cf28e49000231a4b2734d8e7/src/cargo/core/source_id.rs#L158-L165)
and updates the git repo tracking this dependency every time it resolves
dependencies. so, put the fragment back and be very clear to Cargo that
once it has `11371b0...` it does not need to do more git operations to
try getting a newer version of the commit.

(this makes a bit more sense if you imagine source urls like
`git+https://github.com/foo.git?rev=main#1234`: in the case where `rev`
_is_ a commit it's arguably a bug to fetch from remotes when you already
have the commit..)

Cargo generally handles this automatically, compare with [other version
bumps](2c79661).
i suspect i got the lockfile in this state by resolving a merge conflict
incorrectly.
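for illustration, the difference in the Cargo.lock source entry is roughly as follows — the rev is abbreviated here, this is not the literal lockfile contents:

```toml
# non-precise: no fragment, so Cargo treats this as a query and re-fetches
# the git repo on every dependency resolution
source = "git+https://github.com/oxidecomputer/propolis?rev=11371b0"

# precise: the fragment pins the exact commit, so once it's present
# locally no further git operations are needed
source = "git+https://github.com/oxidecomputer/propolis?rev=11371b0#11371b0"
```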