leverage container-image-reference-digest from rpm-ostree for OCI updates #1272

Open
dustymabe opened this issue Mar 3, 2025 · 10 comments

@dustymabe
Member

dustymabe commented Mar 3, 2025

I was discussing this with @jlebon and found out that once we have switched to OCI for updates, the analog of sudo rpm-ostree rebase "fedora/${ARCH}/coreos/${STREAM}" won't really work.

i.e. if you sudo rpm-ostree rebase ostree-remote-image:fedora:registry:quay.io/fedora/fedora-coreos:stable then your zincati updates will break.

I think we need to still keep this working and I think we can. Enough information is in the rpm-ostree status --json output:

[core@cosa-devsh ~]$ rpm-ostree status --json | grep container-image
      "container-image-reference-digest" : "sha256:b5bdaa44a45084bf572c8c0f48d39c30b6c365941e86e60fcefb65aa9836ed9c",
      "container-image-reference" : "ostree-remote-image:fedora:docker://quay.io/fedora/fedora-coreos:rawhide",

So we have both the container digest and the name we are following, which means we should be able to strip the tag from the reference we are following and use the digest instead.
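A minimal sketch of that tag-for-digest swap, assuming only the reference forms that appear in this thread (`ostree-remote-image:<remote>:docker://...`, `ostree-remote-image:<remote>:registry:...` and `ostree-remote-registry:<remote>:...`). The function names are hypothetical and this is not Zincati's actual code:

```python
def strip_to_image(image_reference: str) -> str:
    """Return the bare image name (registry/repo[:tag]) from an ostree image reference."""
    ref = image_reference
    if ref.startswith("ostree-remote-registry:"):
        # ostree-remote-registry:<remote>:<image>
        return ref.split(":", 2)[2]
    if ref.startswith("ostree-remote-image:"):
        # ostree-remote-image:<remote>:<transport><image>
        ref = ref.split(":", 2)[2]
    for transport in ("docker://", "registry:"):
        if ref.startswith(transport):
            return ref[len(transport):]
    raise ValueError(f"unsupported image reference: {image_reference}")


def drop_tag(image: str) -> str:
    """Remove a trailing :tag (and any existing @digest) from an image name."""
    name = image.split("@", 1)[0]
    # A colon that comes before the last slash is a registry port, not a tag.
    if name.rfind(":") > name.rfind("/"):
        name = name[: name.rfind(":")]
    return name


def digested_pullspec(image_reference: str, digest: str) -> str:
    """Combine the followed image name with the digest reported by rpm-ostree."""
    return f"{drop_tag(strip_to_image(image_reference))}@{digest}"


if __name__ == "__main__":
    print(
        digested_pullspec(
            "ostree-remote-image:fedora:docker://quay.io/fedora/fedora-coreos:rawhide",
            "sha256:b5bdaa44a45084bf572c8c0f48d39c30b6c365941e86e60fcefb65aa9836ed9c",
        )
    )
```

The result has the same `name@sha256:...` shape as the digested pullspecs shown elsewhere in this thread, which is why the tag itself no longer matters for the lookup.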

We could actually lean in to this strategy even more by relying on it in zincati by default (i.e. not just handling a corner case but making it part of the strategy), which would allow for more information to be shown to the user without having to leverage CustomOrigin:

i.e. this:

[core@cosa-devsh ~]$ rpm-ostree status 
State: idle
Deployments:
● ostree-remote-image:fedora:registry://quay.io/fedora/fedora-coreos:testing
                   Digest: sha256:b5bdaa44a45084bf572c8c0f48d39c30b6c365941e86e60fcefb65aa9836ed9c
                  Version: 43.20250303.dev.0 (2025-03-03T19:41:42Z)

versus:

[core@cosa-devsh ~]$ rpm-ostree status 
State: idle
Deployments:
● quay.io/fedora/fedora-coreos@sha256:b5bdaa44a45084bf572c8c0f48d39c30b6c365941e86e60fcefb65aa9836ed9c
             CustomOrigin: Fedora CoreOS testing stream
                  Version: 41.20250215.1.0 (2025-02-17T11:44:02Z)

The first form shows you which "stream" we are following along with the actual digest, instead of relying on CustomOrigin.

@cgwalters
Member

> --experimental

(tangent) You don't need this anymore for quite some time, where did you find it?

> then your zincati updates will break.

Can you be more specific? What breaks? Why?

@jlebon
Member

jlebon commented Mar 3, 2025

Discussing this further with Dusty, we realized we kinda need this actually before we can even switch over next bootimages to deploy-via-container. The container-imgref we'd use in image.yaml is a tagged refspec, so the node won't be able to find itself in the graph. So either we'd need to change cosa to support deploying by digested pullspec (+ custom origin for parity with migrated nodes), or we just implement this.

It does seem like a cleaner UX to implement this.

But this does also intersect with the idea I had in coreos/rpm-ostree#5120 (comment) to point the custom URL to the OCI graph. What's nice with that is that the origin configuration is the Zincati graph configuration. They're naturally coupled. But OTOH, it makes for an awkward UX because it wouldn't work, as one would have expected, to just do e.g. rpm-ostree rebase quay.io/fedora/fedora-coreos:stable-graph to rebase because rpm-ostree itself wouldn't understand it. It'd need to be --custom-url ..., which is quite awkward. (Or Zincati would have to provide a wrapper CLI.)

So overall, I lean more towards not doing that now and instead just have the actual tagged refspec in the origin, and have Zincati learn to look at container-image-reference-digest when looking up the booted version in the graph.
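As a rough illustration of that lookup input, here is a sketch that picks the booted deployment out of `rpm-ostree status --json` output and reads its `container-image-reference-digest`. The sample document is a hypothetical, heavily trimmed status (real output carries many more fields), built from values quoted in this thread, and `booted_digest` is an illustrative name rather than anything in Zincati:

```python
import json
from typing import Optional

# Hypothetical, trimmed stand-in for `rpm-ostree status --json` output.
SAMPLE_STATUS = """
{
  "deployments": [
    {
      "booted": false,
      "version": "41.20250215.2.0",
      "container-image-reference-digest": "sha256:e60a16be2dd7e6fdcb30e61a452ced46d0e8e473810d25f82a34eee78f6ce430"
    },
    {
      "booted": true,
      "version": "41.20250105.2.0",
      "container-image-reference": "ostree-remote-image:fedora:registry:quay.io/fedora/fedora-coreos:41.20250105.2.0",
      "container-image-reference-digest": "sha256:c4a15145a232d882ccf2ed32d22c06c01a7cf62317eb966a98340ae4bd56dfa6"
    }
  ]
}
"""


def booted_digest(status_json: str) -> Optional[str]:
    """Return the image digest of the booted deployment, or None if unavailable."""
    status = json.loads(status_json)
    for deployment in status.get("deployments", []):
        if deployment.get("booted"):
            # Absent on deployments that are not container-based.
            return deployment.get("container-image-reference-digest")
    return None


if __name__ == "__main__":
    print(booted_digest(SAMPLE_STATUS))
```

Because the digest is read independently of `container-image-reference`, the same code path would apply whether the origin holds a stream tag, a version tag, or a digested pullspec.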

@dustymabe
Member Author

dustymabe commented Mar 3, 2025

>> --experimental
>
> (tangent) You don't need this anymore for quite some time, where did you find it?

copy/pasta from local notes. I'll remove it from them.

>> then your zincati updates will break.
>
> Can you be more specific? What breaks? Why?

IIUC it's because we're using the container_image_reference string and directly comparing that to what's in the update graph under the payload: key:

{
  "nodes": [
    {
      "version": "41.20241122.1.0",
      "metadata": {
        "org.fedoraproject.coreos.releases.age_index": "166",
        "org.fedoraproject.coreos.scheme": "oci"
      },
      "payload": "quay.io/fedora/fedora-coreos@sha256:edca668671b85a471b80e52851421b353236a7739f51c5e3c373bd9bd97849b3"
    },
    {
      "version": "41.20241215.1.0",
      "metadata": {
        "org.fedoraproject.coreos.releases.age_index": "167",
        "org.fedoraproject.coreos.scheme": "oci"
      },
      "payload": "quay.io/fedora/fedora-coreos@sha256:1a534a3956bd82e517cdb68bb76880d58aa047292e21f0dc2e2e2aaf18eef161"
    },
    {
      "version": "41.20250105.1.1",
      "metadata": {
        "org.fedoraproject.coreos.scheme": "oci",
        "org.fedoraproject.coreos.releases.age_index": "168"
      },
      "payload": "quay.io/fedora/fedora-coreos@sha256:3498c2001f853613ba149acd109413d4d37b4a5e670fbef65beb2f8a6a1febfc"
    },
    {
      "version": "41.20250117.1.0",
      "metadata": {
        "org.fedoraproject.coreos.scheme": "oci",
        "org.fedoraproject.coreos.releases.age_index": "169"
      },
      "payload": "quay.io/fedora/fedora-coreos@sha256:3e79f4635ee3187ab0958c58219aa36a122950a8a7ecee22e37735b381cf61db"
    },
    {
      "version": "41.20250130.1.0",
      "metadata": {
        "org.fedoraproject.coreos.scheme": "oci",
        "org.fedoraproject.coreos.releases.age_index": "170",
        "org.fedoraproject.coreos.updates.rollout": "true",
        "org.fedoraproject.coreos.updates.start_value": "1"
      },
      "payload": "quay.io/fedora/fedora-coreos@sha256:fcd6c0e85b1f80ba23b01d280db9f3e273ba9e4bfde9d00820d5141404ae0918"
    },
    {
      "version": "41.20250215.1.0",
      "metadata": {
        "org.fedoraproject.coreos.scheme": "oci",
        "org.fedoraproject.coreos.updates.start_value": "0",
        "org.fedoraproject.coreos.updates.start_epoch": "1739901600",
        "org.fedoraproject.coreos.updates.rollout": "true",
        "org.fedoraproject.coreos.releases.age_index": "171",
        "org.fedoraproject.coreos.updates.duration_minutes": "2880"
      },
      "payload": "quay.io/fedora/fedora-coreos@sha256:40c0e515ed93cb8a9d5a773fb5a206b13e2edda02d92c6011a84399ceb31ed03"
    }
  ],
  "edges": [
    [
      0,
      5
    ],
    [
      1,
      5
    ],
    [
      2,
      5
    ],
    [
      3,
      5
    ],
    [
      4,
      5
    ],
    [
      0,
      4
    ],
    [
      1,
      4
    ],
    [
      2,
      4
    ],
    [
      3,
      4
    ]
  ]
}

so if someone rebases with sudo rpm-ostree rebase ostree-remote-registry:fedora:quay.io/fedora/fedora-coreos:stable the container_image_reference string will no longer match any nodes in the graph and we'll just report that no upgrade is available.
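To make the mismatch concrete, here is a sketch of the lookup under the assumption described above (plain string equality against each node's `payload`). The graph is a trimmed copy of three nodes from the JSON above with the edges renumbered to match, and `find_node` and `update_targets` are hypothetical helpers, not Zincati functions:

```python
import json

# Trimmed copy of the update graph above: three nodes, edges renumbered.
GRAPH = json.loads("""
{
  "nodes": [
    {"version": "41.20250117.1.0",
     "payload": "quay.io/fedora/fedora-coreos@sha256:3e79f4635ee3187ab0958c58219aa36a122950a8a7ecee22e37735b381cf61db"},
    {"version": "41.20250130.1.0",
     "payload": "quay.io/fedora/fedora-coreos@sha256:fcd6c0e85b1f80ba23b01d280db9f3e273ba9e4bfde9d00820d5141404ae0918"},
    {"version": "41.20250215.1.0",
     "payload": "quay.io/fedora/fedora-coreos@sha256:40c0e515ed93cb8a9d5a773fb5a206b13e2edda02d92c6011a84399ceb31ed03"}
  ],
  "edges": [[0, 2], [1, 2], [0, 1]]
}
""")


def find_node(graph, payload):
    """Return the index of the node whose payload equals `payload`, else None."""
    for index, node in enumerate(graph["nodes"]):
        if node["payload"] == payload:
            return index
    return None


def update_targets(graph, index):
    """Return the versions reachable from node `index` over one edge."""
    return [graph["nodes"][dst]["version"] for src, dst in graph["edges"] if src == index]


if __name__ == "__main__":
    tagged = "quay.io/fedora/fedora-coreos:stable"
    digested = (
        "quay.io/fedora/fedora-coreos@"
        "sha256:3e79f4635ee3187ab0958c58219aa36a122950a8a7ecee22e37735b381cf61db"
    )
    print(find_node(GRAPH, tagged))    # tagged reference: no node matches
    node = find_node(GRAPH, digested)  # digested pullspec: node is found
    print(node, update_targets(GRAPH, node))
```

With the tagged reference the node is never found, so no outgoing edges are considered and no upgrade is reported. With a digested pullspec the node resolves and its edges yield the candidate versions.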

@dustymabe
Member Author

Note that a side effect of what I suggest in the description is that rpm-ostree upgrade --bypass-driver will start working again. We don't necessarily want people to keep using this, but at least it's not worse than what we currently have.

@jbtrystram jbtrystram self-assigned this Mar 4, 2025
@jbtrystram
Contributor

Nice find.

> So overall, I lean more towards not doing that now and instead just have the actual tagged refspec in the origin, and have Zincati learn to look at container-image-reference-digest when looking up the booted version in the graph.

That should be quite straightforward to do.

@jbtrystram
Contributor

jbtrystram commented Mar 5, 2025

Ok so I started to make the code change to do this. Pulling the digest from container-image-reference-digest so we can find ourselves in the update graph is the easy part.
But then, we still want to rebase to a specific digest to be able to do barriers.
And if you do so:

So yes, someone can rebase to ostree-remote-image:fedora:registry:quay.io/fedora/fedora-coreos:stable or even a version tag, and the updates will work from there.
But you still won't be able to tell which stream you are on from the output (unless you know how to parse the version number):

[core@cosa-devsh ~]$ sudo rpm-ostree status
State: idle
Deployments:
● ostree-remote-image:fedora:registry:quay.io/fedora/fedora-coreos@sha256:74448159fae255797012a12eaf9089ea3c1018ffb0b3e76e03483cba59b50bb2
                   Digest: sha256:74448159fae255797012a12eaf9089ea3c1018ffb0b3e76e03483cba59b50bb2
                  Version: 41.20241215.2.0 (2024-12-17T00:07:38Z)

We would need something like rpm-ostree rebase ostree-remote-image:fedora:registry:quay.io/fedora/fedora-coreos:stable --digest abcd123...
Or we keep using the custom origin for now.

@cgwalters
Member

> We would need something like rpm-ostree rebase ostree-remote-image:fedora:registry:quay.io/fedora/fedora-coreos:stable --digest abcd123...

bootc-dev/bootc#1165

jbtrystram added a commit to jbtrystram/zincati that referenced this issue Mar 5, 2025
This pulls the container digest from the rpm-ostree status, independently
of what `container-image-reference` is.
This value is then used to find ourselves in the update graph.

This allows rebasing to a tag (version or stream) while still allowing
zincati to pick up updates.

However, if no custom description is given, rpm-ostree status won't
easily show what stream the node is following:

```
[root@cosa-devsh core]# rpm-ostree status
State: idle
AutomaticUpdatesDriver: Zincati
  DriverState: active; update staged: 41.20250215.2.0; reboot delayed due to active user sessions
Deployments:
  ostree-remote-image:fedora:docker://quay.io/fedora/fedora-coreos@sha256:e60a16be2dd7e6fdcb30e61a452ced46d0e8e473810d25f82a34eee78f6ce430
                   Digest: sha256:e60a16be2dd7e6fdcb30e61a452ced46d0e8e473810d25f82a34eee78f6ce430
                  Version: 41.20250215.2.0 (2025-02-17T11:43:37Z)
                     Diff: 125 upgraded

● ostree-remote-image:fedora:registry:quay.io/fedora/fedora-coreos:41.20250105.2.0
                   Digest: sha256:c4a15145a232d882ccf2ed32d22c06c01a7cf62317eb966a98340ae4bd56dfa6
                  Version: 41.20250105.2.0 (2025-01-06T19:46:25Z)
```
See coreos#1272 for more details.
@cgwalters
Member

I would also reiterate here while I totally understand the rationale in keeping the upgrade graph in the short and even medium term, I am still skeptical that it should be required in the longer term versus something like bootc-dev/bootc#640 for transitions, and using container tags to encode barriers and control upgrades in general. The latter especially is the guidance I'd give to bootc users.

@jlebon
Member

jlebon commented Mar 6, 2025

> We would need something like rpm-ostree rebase ostree-remote-image:fedora:registry:quay.io/fedora/fedora-coreos:stable --digest abcd123...

Right yeah, I think this came up before when we were evaluating the options. It shouldn't be hard to add this to rpm-ostree. I can pick that up.

@jlebon
Member

jlebon commented Mar 6, 2025

Opened coreos/rpm-ostree#5325 for this. While I was there, also added support for finalization via OCI digest as discussed in #1241 (comment).

> I would also reiterate here while I totally understand the rationale in keeping the upgrade graph in the short and even medium term, I am still skeptical that it should be required in the longer term versus something like containers/bootc#640 for transitions, and using container tags to encode barriers and control upgrades in general. The latter especially is the guidance I'd give to bootc users.

Not against that approach FWIW. The hooks approach has its appeal (though I think a limitation there in that specific bootc proposal is to deal with migration issues related to the container stack itself, right?). Some of my concerns in coreos/rpm-ostree#1882 I think are still relevant.

At the end of the day though, we (at the bootc level) need to be able to have good support for update drivers that want to do fancier things. So FCOS exercising that path is useful too. I think also with coreos/fedora-coreos-tracker#1872, the overhead with the FCOS approach is not terrible for what you get in return (phased rollouts and barriers).

Specifically the phased rollouts part (I guess in general really, but I know this one is of more interest than the graph), I would love to be able to share code there. I'm personally also open to moving away from Zincati long term if it means more sharing and retaining some of those features.
