[nexus] Reincarnate instances with `SagaUnwound` VMMs (#6669)
When an `instance-start` saga unwinds, any VMM it created transitions to the `SagaUnwound` state. This causes the instance's effective state to appear as `Failed` in the external API. PR #6503 added functionality to Nexus to automatically restart instances that are in the `Failed` state ("instance reincarnation"). However, the current instance-reincarnation task will _not_ automatically restart instances whose `instance-start` sagas have unwound, because such instances are not actually in the `Failed` state from Nexus' perspective.

This PR implements reincarnation for instances whose `instance-start` sagas have failed. This is done by changing the `instance_reincarnation` background task to query the database for instances which have `SagaUnwound` active VMMs, and then run `instance-start` sagas for them identically to how it runs start sagas for `Failed` instances.

I decided to perform two separate queries, one to list `Failed` instances and one to list instances with `SagaUnwound` VMMs, because the `SagaUnwound` query requires a join with the `vmm` table. I thought it was a bit nicer to be able to find `Failed` instances without the join, and only do the join when looking for `SagaUnwound` ones. Also, having two queries makes it easier to distinguish between `Failed` and `SagaUnwound` instances in logging and the OMDB status output.

This ended up being implemented by adding a parameter to the `DataStore::find_reincarnatable_instances` method that indicates which category of instances to select. I had previously considered making the method on the `InstanceReincarnation` struct that finds instances and reincarnates them take the query as a `Fn` taking the datastore and `DataPageParams` and returning an `impl Future` outputting `Result<Vec<Instance>, ...>`, but figuring out generic lifetimes for the pagination stuff was annoying enough that this felt like the simpler choice.

Fixes #6638
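To make the "one method, one selector parameter" shape concrete, here is a minimal in-memory sketch. It is not the code from this PR: the `ReincarnationReason` enum, the simplified `Instance` and `Vmm` structs, the state enums, and the slice-based signature are all assumptions made for illustration. The real `DataStore::find_reincarnatable_instances` is an async, paginated database query and may apply further filters that are not modeled here.

```rust
use std::collections::HashMap;

/// Hypothetical selector for which category of instances to list.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ReincarnationReason {
    Failed,
    SagaUnwound,
}

/// Simplified stand-ins for the instance and VMM states (assumed, not the real schema).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InstanceState {
    NoVmm,
    Vmm,
    Failed,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum VmmState {
    Running,
    SagaUnwound,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Instance {
    pub id: u32,
    pub state: InstanceState,
    pub active_vmm_id: Option<u32>,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Vmm {
    pub id: u32,
    pub state: VmmState,
}

/// In-memory analogue of a query that takes the category as a parameter.
/// The `Failed` arm only looks at the instance records; the `SagaUnwound`
/// arm additionally "joins" against the VMM records.
pub fn find_reincarnatable_instances<'a>(
    reason: ReincarnationReason,
    instances: &'a [Instance],
    vmms: &[Vmm],
) -> Vec<&'a Instance> {
    match reason {
        // No join needed: the instance record itself says `Failed`.
        ReincarnationReason::Failed => instances
            .iter()
            .filter(|i| i.state == InstanceState::Failed)
            .collect(),
        // Join on the active VMM and keep instances whose VMM is `SagaUnwound`.
        ReincarnationReason::SagaUnwound => {
            let vmm_states: HashMap<u32, VmmState> =
                vmms.iter().map(|v| (v.id, v.state)).collect();
            instances
                .iter()
                .filter(|i| {
                    i.active_vmm_id
                        .and_then(|id| vmm_states.get(&id))
                        .map_or(false, |s| *s == VmmState::SagaUnwound)
                })
                .collect()
        }
    }
}
```

Because the caller passes the reason explicitly, it can run the two lookups one after the other and tag each result set with the reason it was selected for, which is what keeps the two categories distinguishable in logging and status output.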
Showing 10 changed files with 633 additions and 267 deletions.