
Using stage0 to reflect our Trusted Computing Base? #5049

Open
CookieComputing opened this issue Feb 11, 2025 · 3 comments

@CookieComputing

Hi folks,

Thanks for all the support in #5048! We've been playing around with the stage0 bootloader and it seems to work for a variety of our internal use cases, so we're hoping to start using it for some real workloads. That said, we've had to make some interesting changes to our QEMU launch parameters. For one, stage0 does not appear to support kernel-hashes=on, a feature that comes from the tight coupling between QEMU and OVMF. IIRC, this is conventionally used to have the kernel, initrd, and cmdline reflected in the launch digest and therefore attested. When we launch a CVM with the stage0 bootloader and kernel-hashes=on, QEMU fails with:

qemu-system-x86_64: SEV: kernel specified but guest firmware has no hashes table GUID

This is probably expected, since there presumably isn't a matching GUID in the stage0 binary. It isn't necessarily a deal-breaker, but I need to explore this project more to understand how these components are captured with stage0.

Having read the overall remote attestation strategy, it's pretty clear that the bootloader takes an opinionated approach built on DICE attestation. From the code, I can see that these measurements are reflected in Stage0Measurements, generated during stage0, and emitted as a stage0 event, which you then store in an E820 table entry. This looks good, and I imagine we could use it as part of our stack as well, but I was wondering how you folks "predict" these measurements at launch time, and how they are connected to the SNP report.

Currently, with the QEMU + OVMF stack, we are able to use the virTEE sev-snp-measure tool to generate the expected launch digest from a set of components built prior to launch. We've been using this as our way of matching artifacts to the corresponding launch digests in SNP attestation reports. This doesn't work with the existing stage0 bootloader, however, since the GUID table does not exist:

$ /tmp/sev-snp-measure.par --mode snp --ovmf /tmp/stage0_bin --vcpus 4 --vcpu-type EPYC-v4 --kernel /cvm/launch/cvm_vmlinuz --initrd /cvm/launch/layer.cpio.gz
Error: Kernel specified but OVMF metadata doesn't include SNP_KERNEL_HASHES section

My understanding is that the stage0 binary itself is measured using snp_measurement, while the stage1 kernel measurements are captured via a separate utility, but it's not clear to me how these "predicted" measurements are reflected in the SNP attestation report. Our existing stack can generally assume they map to the MEASUREMENT (launch digest) field of the SNP attestation report by running the virTEE tool, but I wanted to confirm whether this is also the case for stage0, and whether you could show me how to generate such predicted measurements for my own workloads!
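For context on what the MEASUREMENT field covers, here is a rough sketch of how the SNP launch digest is chained over the loaded firmware (stage0) pages. This is a simplification based on my reading of the AMD SEV-SNP ABI's PAGE_INFO structure; the exact field layout, page types, and the firmware GPA below are assumptions, so consult the ABI spec or the sev-snp-measure source for the authoritative details:

```python
import hashlib
import struct

ZERO48 = b"\x00" * 48

def extend_launch_digest(ld: bytes, page: bytes, gpa: int, page_type: int = 0x01) -> bytes:
    """Fold one guest page into the running SNP launch digest.

    Simplified PAGE_INFO chaining: LD' = SHA-384(LD || SHA-384(page) ||
    length || type || imi || reserved || gpa). Field layout is an
    assumption based on the SEV-SNP ABI, not a verified implementation.
    """
    # Normal pages (type 0x01) contribute a digest of their contents;
    # other page types contribute zeros in the real ABI.
    contents = hashlib.sha384(page).digest() if page_type == 0x01 else ZERO48
    page_info = (
        ld                                              # DIGEST_CUR: running digest
        + contents                                      # CONTENTS: digest of page data
        + struct.pack("<HBBI", 0x70, page_type, 0, 0)   # length, type, imi, reserved
        + struct.pack("<Q", gpa)                        # guest physical address
    )
    assert len(page_info) == 0x70
    return hashlib.sha384(page_info).digest()

# Firmware pages are folded in one by one, starting from an all-zeros digest.
ld = ZERO48
firmware = b"\x00" * 8192  # placeholder for the stage0 binary image
for off in range(0, len(firmware), 4096):
    # 0xFFFFE000 is a placeholder load address near the top of 4 GiB.
    ld = extend_launch_digest(ld, firmware[off:off + 4096], gpa=0xFFFFE000 + off)
```

The takeaway is that only the pages loaded at launch (here, stage0) influence MEASUREMENT; anything stage0 loads later is invisible to it unless a mechanism like the kernel-hashes GUID table folds extra digests into those pages.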

The other thing I wanted to understand was how this is mapped in your BT solution. Looking at an example rekor log posted on this GitHub repo, I was surprised not to see hashes covering all of the other components used to launch the CVM (the kernel, initrd, cmdline, etc.). Are these reflected somewhere in the rekor log that I missed?

rekorlog.txt

Thanks for your support in advance!

@CookieComputing
Author

Another thing to add: since we're also interested in launching our CVMs in a similar fashion to your Oak Containers (i.e. with a fully-fledged Linux kernel), our setup is more or less the standard QEMU + OVMF stack you might find in an OSS environment. We already have a boot attestation process set up via dm-verity and the initrd, but I'm wondering if you have any suggestions on how best to reflect our own TCB while being as minimally invasive to our setup as possible. If you have any questions about it, I'm happy to answer!

@conradgrobler
Collaborator

Your analysis is correct. Stage 0 doesn't support the kernel hashes mechanism.

Only the identity of Stage 0 is reflected in the AMD attestation report (the output of using the snp_measurement tool on the Stage 0 binary).

The rest of the identities (kernel, initrd, kernel command line, etc.) are reflected only in the DICE chain and the associated event log. The output of the oak_kernel_measurement tool is reflected in the Stage0Measurements event log entry (kernel_measurement and setup_data_digest in https://github.com/project-oak/oak/blob/main/proto/attestation/eventlog.proto#L29).
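As a rough illustration of what such a measurement amounts to, here is a stand-in (not the actual oak_kernel_measurement implementation, which additionally splits the bzImage setup data out from the kernel proper before hashing):

```python
import hashlib

def measure_file(path: str) -> str:
    # Illustrative stand-in for a kernel measurement: SHA2-256 over the raw
    # image bytes, streamed in 1 MiB chunks. The real oak_kernel_measurement
    # tool also separates the bzImage setup data (setup_data_digest) from the
    # kernel itself (kernel_measurement).
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()
```

A verifier would compare a digest computed like this offline against the value found in the Stage0Measurements event log entry.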

The overall identity of the workload is captured in the entire attestation evidence protocol buffer (https://github.com/project-oak/oak/blob/main/proto/attestation/evidence.proto#L110), which includes the attestation report (in root_layer), the event log entries (in event_log) and the DICE certificate chain (in layers). All of that must be checked together to be sure of the workload identity. Our implementation of this logic is in https://github.com/project-oak/oak/blob/main/oak_attestation_verification/src/verifier.rs#L101.
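The core invariant the verifier checks can be sketched as follows. All names and the exact binding scheme here are hypothetical simplifications; Oak's actual logic in verifier.rs is the source of truth:

```python
import hashlib

def verify_evidence(report_measurement: bytes,
                    expected_stage0_digest: bytes,
                    event_log: list[bytes],
                    layer_digests: list[bytes]) -> bool:
    # 1. The AMD report's MEASUREMENT field must match the precomputed
    #    digest of the stage0 binary (hardware root of trust).
    if report_measurement != expected_stage0_digest:
        return False
    # 2. Each serialized event-log entry must hash to the digest that the
    #    corresponding DICE layer certifies, binding the log to the chain.
    if len(event_log) != len(layer_digests):
        return False
    return all(hashlib.sha256(entry).digest() == digest
               for entry, digest in zip(event_log, layer_digests))
```

The point is that no single field identifies the workload; the report, event log, and DICE chain only mean something when cross-checked together.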

All of the endorsements for all of the components are published to the rekor log, and we use a different public key for each component. An example of the public keys used for the Oak Containers components can be found in https://github.com/project-oak/oak/blob/main/oak_attestation_verification/testdata/oc_reference_values_20241205.textproto

Stage 0 is designed to work specifically with event logs and DICE. This means that it only measures the next boot stage; the initial process in the initrd is then responsible for measuring the next phase, and so on. That approach does not seem compatible with your workload, so using QEMU + OVMF + dm-verity seems like the most appropriate approach for your use case.
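The "each stage measures the next" chain can be sketched like this. It's a deliberately simplified DICE-style derivation; the real implementation uses a proper KDF and issues certificates at each layer rather than bare hashes:

```python
import hashlib

def next_secret(secret: bytes, next_stage_measurement: bytes) -> bytes:
    # Simplified DICE-style chaining: each boot stage derives the secret it
    # hands to the next stage from its own secret plus the measurement of
    # the next stage's code. Any change to any earlier stage changes every
    # later secret, so the final identity reflects the whole boot chain.
    return hashlib.sha256(secret + next_stage_measurement).digest()

# stage0 -> kernel -> initrd -> workload
secret = b"\x00" * 32  # placeholder for the hardware-rooted initial secret
for stage in (b"kernel", b"initrd", b"workload"):
    secret = next_secret(secret, hashlib.sha256(stage).digest())
```

This is why stage0 has no need for a kernel-hashes GUID table: later components are bound transitively through the chain instead of being folded into the launch digest up front.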

Another option you could consider is a vTPM running in an SVSM (e.g. https://github.com/coconut-svsm/svsm) as described in https://lpc.events/event/17/contributions/1630/attachments/1185/2592/2023-lpc-claudio-vtpm-v2.pdf

@CookieComputing
Author

Thanks! I suspected as much, given how well integrated DICE is in the stage0 bootloader. That said, I'm considering various hacks to integrate with stage0, just to get the ball rolling here.

That approach does not seem compatible with your workload, so it seems like using QEMU + OVMF + dm-verity is the most appropriate approach for your use case.

We plan on eventually migrating to the DICE attestation stack, but given some other commitments and constraints, we might end up hacking kernel-hash support into stage0 to get some integration with our current stack. It won't look pretty, but at least it'll do for some of our use cases.

Another option you could consider is to use a vTPM running in and SVSM (e.g. https://github.com/coconut-svsm/svsm) as described in https://lpc.events/event/17/contributions/1630/attachments/1185/2592/2023-lpc-claudio-vtpm-v2.pdf

We're actually very interested in the SVSM project! However, given that it's still at a research stage and not yet integrated with the upstream kernel, we're currently exploring other options. That said, there's a strong likelihood we'll end up leveraging SVSM in the future.
