
Add a boot device field to Instance, forward along as appropriate #6585

Merged
merged 44 commits into main on Oct 1, 2024

Conversation

iximeow
Member

@iximeow iximeow commented Sep 16, 2024

the Omicron side of adding explicit boot order selection to instances (counterpart to propolis#756).

first, this extends params::InstanceCreate to take a new boot_disk: Option<params::InstanceDiskAttachment>.

additionally, this adds a PUT /v1/instances/{instance} endpoint to update instances. the only property that can be updated at the moment is boot_disk: picking a new boot disk or unsetting it entirely. this also partially subsumes #6321.

finally, this updates Omicron to reference a recent enough Propolis that #756 is included.

a surprising discovery along the way: you can specify a disk to be attached multiple times in disks today, when creating an instance, and we're fine with it! this carries through with the new boot_disk field: if you specify the disk as boot_disk and in disks, it is technically listing the disk for attachment twice but this will succeed.

@@ -63,6 +63,10 @@ pub struct Instance {
#[diesel(column_name = auto_restart_policy)]
pub auto_restart_policy: Option<InstanceAutoRestart>,

/// The primary boot device for this instance.
#[diesel(column_name = boot_device)]
pub boot_device: Option<String>,
Member Author

if this were just the field it is, it should be Name, since it is .. the name of some other boot device. but on the Propolis side we'll actually accept something more flexible.

@ahl @askfongjojo @karencfv and @david-crespo i'd like your thoughts on how much we should carry that through: recording a BootOrder struct which we can then extend in the future is certainly nice for internal purposes, but i'm less sure how to judge the complexity/extensibility tradeoff for the more user-facing API here.

if we have just "boot device" here it seems straightforward in all uses today, but allowing a future boot order list, or "first boot only"-style settings on boot entries, means we'd need a new field and presumably to allow only one of {boot_device, future_more_complex_list}.

i'm inclined towards keeping just the single optional boot_device string to avoid the extra layers in user-facing APIs, but i don't know if there's other context i should be aware of.

Contributor

Due to names being mutable, I'd prefer the ID of the boot device. For users who rely on automation to manage their infrastructure, an immutable identifier is always preferable. Like how we have project_id as part of this struct :)

Contributor

i'm inclined towards keeping just the single optional boot_device string to avoid the extra layers in user-facing APIs, but i don't know if there's other context i should be aware of.

I agree with this, it seems more future proof as well

Contributor

@david-crespo david-crespo Sep 17, 2024

Yeah, ID would be much more typical. Often we allow something to be set by NameOrId, but we resolve it to an ID immediately and store it as an ID.

Member Author

oh! i wasn't sure that they were mutable. in that case, yes, the ID would be much better. it turns out the name is what gets sent out to Propolis, so i'd just stuck with that here, but perhaps Propolis should be getting IDs as well.

Member

I'd definitely advocate for changing the Propolis API to take a UUID, if at all possible.

Member Author

when we're creating the instance, we require that InstanceDiskAttach specify a name, and not an ID. we probably ought to take NameOrId there. otherwise allowing both here seems to be of limited utility.. the use cases afaict look like:

  • if you're creating new disks along with the instance creation, you don't have IDs yet, so you couldn't provide an ID to boot_disk. simple enough, strong indicator we should at least take Name here.
  • if you're attaching existing disks, you must use their Name, so passing a disk ID here seems like net-more work than just reusing the name you'd have put in disks[] already.
  • if you're updating the boot_disk of a stopped instance, that seems like the most likely case for an ID on its own to be especially helpful.

i don't want to go change InstanceDiskAttach here too (basically just because i don't know if that is laden with secret cans of worms), but if folks agree it seems like it would be nice to allow NameOrId sooner than later for disk attachment.
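a rough sketch of what taking NameOrId and resolving to an ID immediately (as suggested above) could look like — names and types here are illustrative stand-ins, not the real omicron-common definitions:

```rust
use std::collections::HashMap;

// Hypothetical sketch of a NameOrId-style selector for a boot disk.
// The real type would use a Uuid for the Id variant; a String stands
// in here to keep the sketch dependency-free.
#[derive(Debug, Clone, PartialEq)]
enum NameOrId {
    Name(String),
    Id(String),
}

// Resolve a selector to an immutable disk ID right away, so only the
// ID is ever stored, as discussed above.
fn resolve_disk_id(
    sel: &NameOrId,
    disks_by_name: &HashMap<String, String>,
) -> Option<String> {
    match sel {
        NameOrId::Id(id) => Some(id.clone()),
        NameOrId::Name(name) => disks_by_name.get(name).cloned(),
    }
}

fn main() {
    let mut disks_by_name = HashMap::new();
    disks_by_name.insert("mydisk".to_string(), "disk-id-0".to_string());

    // A name resolves through the lookup table; an ID passes through as-is.
    assert_eq!(
        resolve_disk_id(&NameOrId::Name("mydisk".into()), &disks_by_name),
        Some("disk-id-0".to_string())
    );
    assert_eq!(
        resolve_disk_id(&NameOrId::Id("disk-id-0".into()), &disks_by_name),
        Some("disk-id-0".to_string())
    );
    println!("resolved");
}
```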

Contributor

Yeah, we did intend to change that, but I never got around to wrapping it up after Justin left:

#3671

Contributor

but if folks agree it seems like it would be nice to allow NameOrId sooner than later for disk attachment.

Definitely on board with this!

@iximeow
Member Author

iximeow commented Sep 17, 2024

one more note: i do expect to land this with a mechanism to update the boot order on a stopped instance. after reading #6321 this morning it looks like there's consensus around this kind of reconfiguration being a PUT to instance. so i'll expect to add a PUT /v1/instances route here and we can extend it to allow reconfiguring instance vCPUs/memory later.
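for illustration, such an update might look like the below — the exact body shape here is a sketch pending the NameOrId discussion, not final:

```
PUT /v1/instances/{instance}

{
    "boot_disk": {
        "name": "mydisk"
    }
}
```

and sending `"boot_disk": null` would unset the boot disk entirely.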

@gjcolombo gjcolombo self-requested a review September 19, 2024 20:31
additionally, add a general-purpose instance update endpoint to change
the boot device for an existing instance, as well as tests that the
constraints around what we can do with boot devices and the boot device
field are checked.
if the disk is not in the same project, it will not be attached, so it
will not be eligible to be the boot device.
also no longer need authz_project_disk
might return to this later, but as-is i've gotten something wrong with
the query..
(and fix a few test references to now-boot_device_id)
"you gotta run cargo xtask openapi generate"
@iximeow
Member Author

iximeow commented Sep 30, 2024

@ahl @david-crespo @karencfv 8b22d56 is the change to have this PR accept an InstanceDiskAttachment rather than just the name of a boot disk.

from an API perspective, the difference would be something like:

current PR:

POST /v1/instances

{
    "boot_disk": {
        "name": "mydisk"
    },
    "disks": [
        { "type": "attach", "name": "mydisk" }
    ],
    ....
}

vs with 8b22d56:

POST /v1/instances

{
    "boot_disk": {
        "type": "attach",
        "name": "mydisk"
    },
    "disks": [],
    ....
}

where since boot_disk is an InstanceDiskAttachment, it's also valid to provide a disk creation request (with an appropriate disk source)

so the biggest wrinkle with 8b22d56 is that we end up with disks: [] in the most likely case of instance creation. i'd kind of prefer that to be data_disks: [], but that would be a breaking change and i'm not sure how eagerly we do that.

@ahl mentions that most of the control plane requests come from other software we maintain (CLI, terraform, etc) so for the time being we can be pretty OK with breaking changes. from @karencfv it sounds that validating the Terraform provider at this point might be more work than we expected a breaking change to be.

either way i'm inclined to pick the commit but i'm less confident about changing disks. what do yall think?
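to make the 8b22d56 shape concrete, here's a sketch of the types it implies — the variant names and the disk-creation fields are hypothetical stand-ins, not the real params definitions:

```rust
// Hypothetical sketch of the InstanceDiskAttachment shape implied by
// the discussion: one enum covers both attaching an existing disk and
// creating a new one, so `boot_disk` can reuse it directly.
#[derive(Debug, PartialEq)]
enum InstanceDiskAttachment {
    // corresponds to { "type": "attach", "name": "mydisk" }
    Attach { name: String },
    // disk-creation params elided; `size_gib` is a placeholder field
    Create { name: String, size_gib: u64 },
}

// Sketch of the relevant slice of the create params.
struct InstanceCreate {
    boot_disk: Option<InstanceDiskAttachment>,
    disks: Vec<InstanceDiskAttachment>,
}

fn main() {
    // The common case after 8b22d56: one boot disk, no data disks.
    let params = InstanceCreate {
        boot_disk: Some(InstanceDiskAttachment::Attach { name: "mydisk".into() }),
        disks: Vec::new(),
    };
    assert!(matches!(
        &params.boot_disk,
        Some(InstanceDiskAttachment::Attach { name }) if name.as_str() == "mydisk"
    ));
    assert!(params.disks.is_empty());
    println!("boot disk set; data disks empty");
}
```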

@iximeow iximeow marked this pull request as ready for review September 30, 2024 16:39
@karencfv
Contributor

Thanks @iximeow for putting in all the effort to make this a much more intuitive API to work with!

I'm 100% on board with

POST /v1/instances

{
    "boot_disk": {
        "type": "attach",
        "name": "mydisk"
    },
    "disks": [],
    ....
}

It's much more intuitive this way. I'd assume that in this scenario the user would get an error if they set the same disk in boot_disk and disks?

Regarding:

so the biggest wrinkle with 8b22d56 is that we end up with disks: [] in the most likely case of instance creation. i'd kind of prefer that to be data_disks: [], but that would be a breaking change and i'm not sure how eagerly we do that.

If it's just changing the name of the field and none of the functionality, I think it wouldn't be as bad as I thought from the provider side. But there is the issue where we would break all of the terraform configuration files of the customers as they'd have to migrate their HCL config files to use data_disks instead of disks. Is this something we're OK with?

@karencfv
Contributor

perhaps we can think about it a bit more and hold the breaking change for a later release?

@david-crespo
Contributor

I'm on the fence. I like the data_disks change for making clear up front why there are two disk-related keys, so I agree we should do it either now or for v12.

I'm concerned about it being annoying to customers for any existing scripts to break as soon as the rack comes back up after the v11 upgrade. How do we minimize that pain? It feels weird to say this because we break APIs all the time, but I don't have a good feel for how that goes for customers. Do we just warn them hey, make sure you upgrade to the latest CLI and SDKs and Terraform (the release notes always say this at the top), and they do it and it's fine?

I tried to game out whether we gain something by waiting until next release to break the API, but in most scenarios I could think of, we end up incurring the same pain later as now, so we might as well do it now. For example: if most instance creates in the real world use one disk, and it's the boot disk, in theory people can move that disk attachment over to boot_disk (at their leisure, some time between v11 and v12), leaving disks empty, so when we then change that to data_disks in the next release, most people won't have to change anything. But in this situation (one disk), by stipulation, the change is not mandatory and doesn't change the behavior, so why would anyone do it until forced to in the next release when the API breaks?

@karencfv
Contributor

I'm concerned about it being annoying to customers for any existing scripts to break as soon as the rack comes back up after the v11 upgrade. How do we minimize that pain? It feels weird to say this because we break APIs all the time, but I don't have a good feel for how that goes for customers. Do we just warn them hey, make sure you upgrade to the latest CLI and SDKs and Terraform (the release notes always say this at the top), and they do it and it's fine?

Yeah, generally I list any breaking change as part of the changelog and release notes. I might have not phrased it correctly, but it wasn't so much saying that we shouldn't change it because it's a breaking change, but rather just point out that in this case it would be a breaking change the customers would see and have to take action.

I tried to game out whether we gain something by waiting until next release to break the API, but in most scenarios I could think of, we end up incurring the same pain later as now, so we might as well do it now. For example: if most instance creates in the real world use one disk, and it's the boot disk, in theory people can move that disk attachment over to boot_disk (at their leisure, some time between v11 and v12), leaving disks empty, so when we then change that to data_disks in the next release, most people won't have to change anything. But in this situation (one disk), by stipulation, the change is not mandatory and doesn't change the behavior, so why would anyone do it until forced to in the next release when the API breaks?

Sorry, didn't mean to imply the pain would be less, but rather that perhaps we may want to give ourselves time to think if this is a breaking change we think is worth making or not. Mostly because in this case the customers would feel it and the terraform instance resource is one of the most used ones, and the release date is so soon.

@iximeow
Member Author

iximeow commented Sep 30, 2024

leaving disks empty, so when we then change that to data_disks in the next release, most people won't have to change anything.

this makes me realize that since disks is non-optional, changing it to data_disks means even an empty disks would need to be changed to an empty data_disks. changing the field and making it optional means we might happily ignore an old request that tried to send disks, treating it as having data_disks: None.

mixed bag, there..

also, @karencfv, re.

I'd assume that in this scenario the user would get an error if they set the same disk in boot_disk and disk?

this was a behavior that surprised me as well - we actually accept duplicate entries in disks today, and since boot_disk is treated as just another disk for all disk creation/attachment purposes, if it duplicates an entry in disks that is also accepted. i'm not sure if this is something we especially desire, but i've added a test for it so that we know if we change it. the test also notes that noncommittal stance in case someone finds it in the future :)

@karencfv
Contributor

this makes me realize that since disks is non-optional, changing it to data_disks means even an empty disks would need to be changed to an empty data_disks. changing the field and making it optional means we might happily ignore an old request that tried to send disks, treating it as having data_disks: None.

mixed bag, there..

Hm, yeah, that wouldn't be great. Especially if a user forgot to upgrade their provider and is managing a large number of instances 🤔

this was a behavior that surprised me as well - we actually accept duplicate entries in disks today, and since boot_disk is treated as just another disk for all disk creation/attachment purposes, if it duplicates an entry in disks that is also accepted. i'm not sure if this is something we especially desire, but i've added a test for it so that we know if we change it. the test also notes that noncommittal stance in case someone finds it in the future :)

Ha! TIL. I guess if duplicates have been accepted all along then not a huge issue here 🤷‍♀️ . It'd probably be nice to have some sort of validation to make sure duplicates are not accepted but, that's probably for another time as it's not really up to this PR to fix.
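A sketch of what that duplicate-attachment validation could look like, if we ever add it — this is a hypothetical helper, not code from this PR:

```rust
use std::collections::HashSet;

// Hypothetical validation: reject a create request that names the same
// disk more than once across `boot_disk` and `disks`.
fn check_no_duplicate_disks(names: &[&str]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for name in names {
        // `insert` returns false when the name was already present.
        if !seen.insert(*name) {
            return Err(format!("disk {name:?} specified more than once"));
        }
    }
    Ok(())
}

fn main() {
    // Distinct names (boot_disk plus data disks) pass.
    assert!(check_no_duplicate_disks(&["mydisk", "data0"]).is_ok());
    // A disk listed as both boot_disk and in disks would be rejected.
    assert!(check_no_duplicate_disks(&["mydisk", "mydisk"]).is_err());
    println!("validation sketch ok");
}
```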

}
}

// if and when `Update` can update other fields, set them
Member

i believe this has been renamed?

Suggested change
// if and when `Update` can update other fields, set them
// if and when `Reconfigure` can update other fields, set them

Contributor

@gjcolombo gjcolombo left a comment

The internal mechanics of this all look great to me (and I've got a good sense for what I'd need to do to remake #6321 on the back of the update API you've added here). Thanks for putting all the time in on this one!

.await
.map_err(ActionError::action_failed)?;

// If there was a boot disk, clear it. If there was not a boot disk,
Contributor

Would it be too clever to try to look in the params to see if a boot disk was specified? Saves us from having to do the extra database calls to unset something we never bothered to set.

@jmpesp
Contributor

jmpesp commented Oct 1, 2024

you can specify a disk to be attached multiple times in disks today, when creating an instance, and we're fine with it!

I'm surprised we allow this, and additionally very surprised this works - I would have thought that the activations for multiply attached volumes would stomp all over each other (see https://rfd.shared.oxide.computer/rfd/0177#_activation for more on this). Was this tested with a real instance?

@iximeow
Member Author

iximeow commented Oct 1, 2024

Was this tested with a real instance?

not at the time, but at least through the control plane on my workstation it seems to work:

oxide.rs> jq . < body.json
{
  "name": "bootorder",
  "description": "multidisk test",
  "hostname": "bootorder",
  "memory": 1073741824,
  "ncpus": 2,
  "disks": [
    {
      "type": "attach",
      "name": "bootorder-4b94abf9-8bff-4b5f-9612-469-3ded17"
    },
    {
      "type": "attach",
      "name": "bootorder-4b94abf9-8bff-4b5f-9612-469-3ded17"
    }
  ]
}


oxide.rs> ./target/release/oxide --profile recovery2 instance create --project test --json-body body.json
{
  "auto_restart_enabled": true,
  "description": "multidisk test",
  "hostname": "bootorder",
  "id": "76081c8d-5391-41ec-889f-9a14ebd043d6",
  "memory": 1073741824,
  "name": "bootorder",
  "ncpus": 2,
  "project_id": "534bd3ae-7244-4324-8e6f-b6d8140b5079",
  "run_state": "starting",
  "time_created": "2024-10-01T02:28:17.156047Z",
  "time_modified": "2024-10-01T02:28:17.156047Z",
  "time_run_state_updated": "2024-10-01T02:28:20.481057Z"
}

oxide.rs> ./target/release/oxide --profile recovery2 instance list --project test
[
  {
    "auto_restart_enabled": true,
    "description": "multidisk test",
    "hostname": "bootorder",
    "id": "76081c8d-5391-41ec-889f-9a14ebd043d6",
    "memory": 1073741824,
    "name": "bootorder",
    "ncpus": 2,
    "project_id": "534bd3ae-7244-4324-8e6f-b6d8140b5079",
    "run_state": "running",
    "time_created": "2024-10-01T02:28:17.156047Z",
    "time_modified": "2024-10-01T02:28:17.156047Z",
    "time_run_state_updated": "2024-10-01T02:28:32.422132Z"
  }
]

i'm quite certain if the duplicate entries were to create new disks it would fail for reasons like you indicate (if nothing else, duplicate names should prevent rows from being created, let alone getting to any real Crucibles). but for attachment, it looks to me like we just end up thinking the already-successful attach is unremarkable?

Contributor

@karencfv karencfv left a comment

Thanks for the huge effort to get this feature rolled out! The user facing side of the API looks great to me!

@jmpesp
Contributor

jmpesp commented Oct 1, 2024

I'd like to test this before this gets merged if that's ok, I'll do it first thing tomorrow morning.

@jmpesp
Contributor

jmpesp commented Oct 1, 2024

for attachment, it looks to me like we just end up thinking the already-successful attach is unremarkable?

Yep, sending a POST body with multiple of the same attach statement doesn't make its way to propolis, I'd be willing to bet the code you highlighted is what's responsible.

So: surprising that Nexus accepts that POST body but we do the right thing. 🚢

@iximeow iximeow merged commit 144e91a into main Oct 1, 2024
19 checks passed
@iximeow iximeow deleted the ixi/boot-order branch October 1, 2024 17:13
iximeow added a commit that referenced this pull request Oct 1, 2024
Greg and Eliza were both right in comments on #6585, but since these are
both fully internal I didn't want to add another CI round trip there :)
hawkw added a commit that referenced this pull request Oct 1, 2024
This commit extends the `instance-reconfigure` API endpoint added in
#6585 to also allow setting instance auto-restart policies (as added in
#6503).

I've also added the actual auto-restart policy to the external API
instance view, along with the boolean `auto_restart_enabled` added in
#6503. This way, it's possible to change just the boot disk by providing
the current auto-restart policy in an instance POST.
iximeow added a commit that referenced this pull request Oct 2, 2024
in #6585 i'd unintentionally removed the fragment from `bhyve_api`'s
source entry in Cargo.lock. it's not _wrong_, since it's still
specifying the desired `11371b0...`, but without the fragment at the end
of the source Cargo parses this as a [non-precise
reference](https://github.com/rust-lang/cargo/blob/d9c14e664e994ea5cf28e49000231a4b2734d8e7/src/cargo/core/source_id.rs#L158-L165)
and updates the git repo tracking this dependency every time it resolves
dependencies. so, put the fragment back and be very clear to Cargo that
once it has `11371b0...` it does not need to do more git operations to
try getting a newer version of the commit.

(this makes a bit more sense if you imagine source urls like
`git+https://github.com/foo.git?rev=main#1234`: in the case where `rev`
_is_ a commit it's arguably a bug to fetch from remotes when you already
have the commit..)

Cargo generally handles this automatically, compare with [other version
bumps](2c79661).
i suspect i got the lockfile in this state by resolving a merge conflict
incorrectly.
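for illustration, the difference in the Cargo.lock source entry is roughly as follows — the rev is abbreviated here, this is not the literal lockfile contents:

```toml
# non-precise: no fragment, so Cargo treats this as a query and re-fetches
# the git repo on every dependency resolution
source = "git+https://github.com/oxidecomputer/propolis?rev=11371b0"

# precise: the fragment pins the exact commit, so once it's present
# locally no further git operations are needed
source = "git+https://github.com/oxidecomputer/propolis?rev=11371b0#11371b0"
```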