Docker provider e2e test #2188

Open
wants to merge 5 commits into
base: main

stevenhorsman
Member

This is based on Wainer's PR #1965, with some work done to make it fit into the new e2e world

@wainersm
Member

wainersm commented Dec 5, 2024

Thanks @stevenhorsman !

I ran it on my fork, but it failed to install CAA: https://github.com/wainersm/cc-cloud-api-adaptor/actions/runs/12180636082/job/33975999946

I will add a step to gather logs at the end of the job on failure, as we do for libvirt.

@wainersm
Member

wainersm commented Dec 5, 2024

@stevenhorsman I pushed two new commits to enable debug output in the docker workflow.

In https://github.com/wainersm/cc-cloud-api-adaptor/actions/runs/12185682757/job/33993001651, it fails to start CAA:

  Usage: cloud-api-adaptor <provider-name> [options] | help | version
  
  Supported cloud providers are:
  	gcp
  	azure
  	aws
  	ibmcloud
  	vsphere

The caa image being used has no support for docker... is the support built in by default?
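
A quick way to confirm this from a captured usage message is to parse the provider list and look for the expected name. The following is only a sketch: the `supported_providers` helper is hypothetical, and it assumes the usage text keeps the layout shown above (a "Supported cloud providers are:" line followed by one provider per line).

```python
def supported_providers(usage_text: str) -> list[str]:
    """Return provider names listed after the 'Supported cloud providers are:' line."""
    providers = []
    in_list = False
    for line in usage_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Supported cloud providers are"):
            in_list = True
            continue
        if in_list:
            if not stripped:
                break  # a blank line ends the provider list
            providers.append(stripped)
    return providers


# Usage text copied from the failing job quoted in this thread.
USAGE = """Usage: cloud-api-adaptor <provider-name> [options] | help | version

Supported cloud providers are:
\tgcp
\tazure
\taws
\tibmcloud
\tvsphere
"""

print("docker" in supported_providers(USAGE))  # prints False
```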

@stevenhorsman
Member Author

So we need to use the dev image to get libvirt support due to libvirt using cgo bindings.

@stevenhorsman
Member Author

I think there is also an issue with CAA_IMAGE envs (and properties) just being ignored, so I might switch docker to use the same mechanism as libvirt and do the kustomize rather than install directory.

@stevenhorsman
Member Author

@wainersm I've made and pushed the change that should fix the CAA issue, and I'm testing it now. I think it can be fixup'd into a previous commit, but as it might be controversial I wanted to give you the chance to see it first.

@wainersm
Member

wainersm commented Dec 6, 2024

Hi @stevenhorsman !

> @wainersm I've made and pushed the change that should fix the CAA issue, and I'm testing it now. I think it can be fixup'd into a previous commit, but as it might be controversial I wanted to give you the chance to see it first.

I agree that the pre_install stuff makes no sense anymore. Yep, let's remove it completely when we finish the removal of the packer workflows. Ah, I think we should leave your last commit as is; the commit message has a nice explanation for the history of this git repository.

Did the test finish? Did it work out?

@stevenhorsman
Member Author

stevenhorsman commented Dec 6, 2024

> Did the test finish? Did it work out?

The test failed: https://github.com/stevenhorsman/cloud-api-adaptor/actions/runs/12196066119/job/34023415633. We only seem to get the last ~10 lines of the CAA log, which isn't super helpful in this case. I think it might have to be a manual debug, though I tried that a few weeks ago when I first wrote some of this code and didn't get very far. I'm not sure if we still get this problem, which used to always happen on the docker provider: https://github.com/confidential-containers/cloud-api-adaptor/tree/main/src/cloud-api-adaptor/docker#troubleshooting

@wainersm
Member

wainersm commented Dec 6, 2024

> > Did the test finish? Did it work out?
>
> The test failed: https://github.com/stevenhorsman/cloud-api-adaptor/actions/runs/12196066119/job/34023415633. We only seem to get the last ~10 lines of the CAA log, which isn't super helpful in this case. I think it might have to be a manual debug, though I tried that a few weeks ago when I first wrote some of this code and didn't get very far. I'm not sure if we still get this problem, which used to always happen on the docker provider: https://github.com/confidential-containers/cloud-api-adaptor/tree/main/src/cloud-api-adaptor/docker#troubleshooting

Yeah, I will reserve some time to debug locally. In any case, I sent a fixup to print all CAA log messages.

For the record, I never got the docker provider to work locally (nor in my attempt to introduce the CI workflow) :(

@stevenhorsman
Member Author

> For the record, I never got the docker provider to work locally (nor in my attempt to introduce the CI workflow) :(

Yeah, I haven't had it working for a long time. I think Pradipta might have though.

@wainersm
Member

I was able to reproduce it locally. Looking at a dangling podvm container, the kata-agent and agent-protocol-forwarder services are failing to start due to the lack of their configuration files in /run/peerpod.

Dec 20 22:46:43 e1f29dbb2729 systemd[1]: Starting agent-protocol-forwarder.service - Agent Protocol Forwarder...
Dec 20 22:46:43 e1f29dbb2729 (orwarder)[21686]: agent-protocol-forwarder.service: Referenced but unset environment variable evaluates to an empty string: OPTIONS, TLS_OPTIONS
Dec 20 22:46:43 e1f29dbb2729 agent-protocol-forwarder[21686]: agent-protocol-forwarder version unknown
Dec 20 22:46:43 e1f29dbb2729 agent-protocol-forwarder[21686]:   commit: unknown
Dec 20 22:46:43 e1f29dbb2729 agent-protocol-forwarder[21686]:   go: go1.22.7
Dec 20 22:46:43 e1f29dbb2729 agent-protocol-forwarder[21686]: /usr/local/bin/agent-protocol-forwarder: failed to open /run/peerpod/daemon.json: open /run/peerpod/daemon.json: no such file or directory
Dec 20 22:46:43 e1f29dbb2729 systemd[1]: agent-protocol-forwarder.service: Main process exited, code=exited, status=1/FAILURE
Dec 20 22:46:43 e1f29dbb2729 systemd[1]: agent-protocol-forwarder.service: Failed with result 'exit-code'.
Dec 20 22:46:43 e1f29dbb2729 systemd[1]: Failed to start agent-protocol-forwarder.service - Agent Protocol Forwarder.
Dec 20 16:10:15 e1f29dbb2729 systemd[1]: Starting kata-agent.service - Kata Agent...
Dec 20 16:10:15 e1f29dbb2729 kata-agent[291]: umount: /sys/fs/cgroup/misc: no mount point specified.
Dec 20 16:10:15 e1f29dbb2729 systemd[1]: Started kata-agent.service - Kata Agent.
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]: thread 'main' panicked at src/main.rs:132:75:
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]: called `Result::unwrap()` on an `Err` value: AgentConfig from args
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]: Caused by:
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:     0: Failed to read config file /run/peerpod/agent-config.toml
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:     1: No such file or directory (os error 2)
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]: Stack backtrace:
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    0: kata_agent::config::AgentConfig::from_config_file
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    1: spin::once::Once<T,R>::try_call_once_slow
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    2: kata_agent::real_main::{{closure}}
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    3: kata_agent::main
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    4: std::sys_common::backtrace::__rust_begin_short_backtrace
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    5: std::rt::lang_start
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]: stack backtrace:
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    0:     0x7f0742c9dc7c - std::backtrace_rs::backtrace::libunwind::trace::ha752d28ad8b59776
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    1:     0x7f0742c9dc7c - std::backtrace_rs::backtrace::trace_unsynchronized::h19085f9ccfa8c39c
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    2:     0x7f0742c9dc7c - std::sys_common::backtrace::_print_fmt::h7fa8e07ec2f6097e
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:67:5
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    3:     0x7f0742c9dc7c - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hf031f84ba50c0cfa
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:44:22
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    4:     0x7f0741f31750 - core::fmt::rt::Argument::fmt::h41c4ec8113ce6748
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/fmt/rt.rs:142:9
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    5:     0x7f0741f31750 - core::fmt::write::h9d6dcce53fee8498
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/fmt/mod.rs:1120:17
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    6:     0x7f0742c64e32 - std::io::Write::write_fmt::h1592172be29051fa
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/io/mod.rs:1762:15
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    7:     0x7f0742ca0a6e - std::sys_common::backtrace::_print::h077b25e4211c79a3
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:47:5
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    8:     0x7f0742ca0a6e - std::sys_common::backtrace::print::h0992ac2bfe77cad2
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:34:9
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:    9:     0x7f0742ca0200 - std::panicking::default_hook::{{closure}}::h2cf07aea02dfbfab
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:272:22
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   10:     0x7f0742ca13ab - std::panicking::default_hook::h85b79716976ec2d8
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:292:9
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   11:     0x7f0742ca13ab - std::panicking::rust_panic_with_hook::h8687ef8cbe256fb2
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:779:13
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   12:     0x7f0742ca0dcc - std::panicking::begin_panic_handler::{{closure}}::ha182011c2febc20e
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:657:13
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   13:     0x7f0742ca0d26 - std::sys_common::backtrace::__rust_end_short_backtrace::h8ea229012037d9c8
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:170:18
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   14:     0x7f0742ca0d11 - rust_begin_unwind
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645:5
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   15:     0x7f0741d4a354 - core::panicking::panic_fmt::h4a1a6d8dbc505935
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   16:     0x7f0741d4a8c2 - core::result::unwrap_failed::h75343c7b07dcc6a8
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:                                at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/result.rs:1653:5
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   17:     0x7f0741d5927a - spin::once::Once<T,R>::try_call_once_slow::hefb19772f3a2933f
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   18:     0x7f07422fa961 - kata_agent::real_main::{{closure}}::haa1da8388ed3a9e0
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   19:     0x7f07421a673e - kata_agent::main::hdf8911a3c283ff3d
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   20:     0x7f074220a9e3 - std::sys_common::backtrace::__rust_begin_short_backtrace::h4d8f3f5ab56a1476
Dec 20 16:10:15 e1f29dbb2729 kata-agent[293]:   21:     0x7f074226ecd0 - std::rt::lang_start::h2a2d0f24325dfe2a
Dec 20 16:10:15 e1f29dbb2729 systemd[1]: kata-agent.service: Main process exited, code=exited, status=101/n/a

The only file mounted in /run/peerpod is policy.rego:

[root@e1f29dbb2729 /]# ls /run/peerpod/
policy.rego
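
The same check can be scripted when inspecting a podvm container. This is a minimal sketch: the two file names come from the error messages above, while the `missing_config_files` helper and the idea that these two files are the complete required set are assumptions.

```python
from pathlib import Path

# File names taken from the journal errors in this comment; treating them as
# the full set of required files is an assumption.
REQUIRED_FILES = ("daemon.json", "agent-config.toml")


def missing_config_files(config_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in config_dir."""
    base = Path(config_dir)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    # Inside the podvm container from this comment, both names would be reported.
    print(missing_config_files("/run/peerpod"))
```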

@bpradipt
Member

> I was able to reproduce it locally. Looking at a dangling podvm container, the kata-agent and agent-protocol-forwarder services are failing to start due to the lack of their configuration files in /run/peerpod.
>
> The only file mounted in /run/peerpod is policy.rego

The daemon.json is created by process_user_data - https://github.com/confidential-containers/cloud-api-adaptor/blob/main/src/cloud-providers/docker/provider.go#L78
Maybe you don't have the latest container image?

@bpradipt
Member

This may be related if you were using the latest code: #2222

@wainersm
Copy link
Member

Hi @bpradipt !

> This may be related if you were using the latest code: #2222

Yep, I'm using the latest podvm and caa images, ghcr.io/confidential-containers/cloud-api-adaptor/podvm-docker-image-amd64:latest and ghcr.io/confidential-containers/cloud-api-adaptor:latest-amd64-dev

Thanks for this fix! The symptoms match what I've seen in my environment, where /run/media/cidata/user-data doesn't show up for process-user-data:

[root@695f9834e68a /]# systemctl status process-user-data
● process-user-data.service - Process user data
     Loaded: loaded (/etc/systemd/system/process-user-data.service; enabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: active (exited) since Mon 2024-12-23 15:35:37 UTC; 6min ago
    Process: 44 ExecStart=/usr/local/bin/process-user-data provision-files (code=exited, status=0/SUCCESS)
   Main PID: 44 (code=exited, status=0/SUCCESS)
        CPU: 32ms

Dec 23 15:35:37 695f9834e68a process-user-data[44]: 2024/12/23 15:35:37 [userdata/provision] unsupported user data provider, we extract and calculate initdata hash only.
Dec 23 15:35:37 695f9834e68a process-user-data[44]: 2024/12/23 15:35:37 [userdata/provision] File /run/peerpod/initdata not found, skipped initdata processing.
Notice: journal has been rotated since unit was started, output may be incomplete

I will give the fix a try.
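
For triage, the missing-file messages quoted in this thread follow a few recognizable shapes, so they can be pulled out of a journal dump mechanically. The sketch below is illustrative only: the three patterns are derived from the log lines quoted in this conversation, not from any documented log format, and `missing_paths` is a hypothetical helper.

```python
import re

# Patterns derived from the log lines quoted in this thread (an assumption,
# not a documented format).
MISSING_FILE_PATTERNS = [
    re.compile(r"open (\S+): no such file or directory"),
    re.compile(r"Failed to read config file (\S+)"),
    re.compile(r"File (\S+) not found"),
]


def missing_paths(log_lines):
    """Return the distinct file paths reported as missing, in order of appearance."""
    found = []
    for line in log_lines:
        for pattern in MISSING_FILE_PATTERNS:
            match = pattern.search(line)
            if match and match.group(1) not in found:
                found.append(match.group(1))
    return found


LOG = [
    "agent-protocol-forwarder[21686]: /usr/local/bin/agent-protocol-forwarder: failed to open /run/peerpod/daemon.json: open /run/peerpod/daemon.json: no such file or directory",
    "kata-agent[293]:     0: Failed to read config file /run/peerpod/agent-config.toml",
    "process-user-data[44]: [userdata/provision] File /run/peerpod/initdata not found, skipped initdata processing.",
]

for path in missing_paths(LOG):
    print(path)
```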

@wainersm
Member

wainersm commented Jan 6, 2025

Last week I applied #2222 on top of this PR to check the fix, but I realized the built caa image had the old (from main branch) code. I will find some time to audit this workflow again to find the reason.

@wainersm wainersm force-pushed the docker-provider-e2e-test branch from 919bdda to 98fc450 Compare January 13, 2025 15:01
@wainersm
Member

Rebased the branch with main to pick up the latest fix to docker: #2222

@bpradipt
Member

> Rebased the branch with main to pick up the latest fix to docker: #2222

@wainersm I was wondering whether we should default to crio for the docker e2e, since with crio we don't hit the nydus snapshotter issues. What are your thoughts?

@wainersm
Member

> > Rebased the branch with main to pick up the latest fix to docker: #2222
>
> @wainersm I was wondering whether we should default to crio for the docker e2e, since with crio we don't hit the nydus snapshotter issues. What are your thoughts?

That's a good idea @bpradipt !

I only need to remember how to switch to crio on Kind. ;)

@bpradipt
Member

> > > Rebased the branch with main to pick up the latest fix to docker: #2222
> >
> > @wainersm I was wondering whether we should default to crio for the docker e2e, since with crio we don't hit the nydus snapshotter issues. What are your thoughts?
>
> That's a good idea @bpradipt !
>
> I only need to remember how to switch to crio on Kind. ;)

;) .. Let me see if I can help..
You can set CONTAINER_RUNTIME=crio
https://github.com/confidential-containers/cloud-api-adaptor/blob/main/src/cloud-api-adaptor/docker/kind_cluster.sh#L32
https://github.com/confidential-containers/cloud-api-adaptor/blob/main/src/cloud-api-adaptor/docker/README.md#test-execution
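
If the cluster setup is driven from a test harness rather than a shell, the variable can be injected into the child process environment. This is a rough sketch, assuming only what is said above (that the script honors CONTAINER_RUNTIME); the helper names are hypothetical, and the script's arguments are not given in this thread, so the command is left to the caller.

```python
import os
import subprocess


def runtime_env(runtime="crio", base_env=None):
    """Return a copy of the environment with CONTAINER_RUNTIME set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CONTAINER_RUNTIME"] = runtime
    return env


def run_with_runtime(command, runtime="crio"):
    """Run `command` (a list of arguments) with CONTAINER_RUNTIME exported."""
    return subprocess.run(command, env=runtime_env(runtime), check=True)


# The command to pass (for example the kind_cluster.sh script linked above and
# its arguments) is up to the caller; nothing is executed here.
print(runtime_env("crio", {})["CONTAINER_RUNTIME"])  # prints crio
```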

Create a common debugger script (./hack/ci-e2e-debug-fail.sh) that
should be called by workflows in case of failure, to help with
debugging.

By switching to a common script we avoid the problem of testing on
pull_request_target triggered workflows. It also reduces the amount of
duplicated code.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
@wainersm wainersm force-pushed the docker-provider-e2e-test branch from 98fc450 to 9880a8e Compare January 20, 2025 19:16
@wainersm
Member

@bpradipt @stevenhorsman let me give a status on this. Updated to use CRI-O, applied some fixups, and rebased.

I ran it in my fork, but the job failed due to a behavior I never hit before: the podvm_image input is passed empty (https://github.com/confidential-containers/cloud-api-adaptor/pull/2188/files#diff-17dffd6a19ac160362ba3adea5954e898e06afb2b78e13616cfaf906d2b697b1R311). I see this behavior not just for the docker provider but also for libvirt; however, the same parameter for libvirt is non-empty in the daily e2e CI. Maybe I'm hitting a limitation or bug when dealing with workflows in forks. I don't know yet; I'm still trying to understand.

The job in question:

https://github.com/wainersm/cc-cloud-api-adaptor/actions/runs/12873694770/job/35892242906
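
One way to surface this kind of problem earlier is a fail-fast check on required inputs before the tests start. The sketch below is hypothetical and not part of this PR: `empty_inputs` is an invented helper, `podvm_image` is the input named above, and `caa_image` with its value is a made-up example.

```python
def empty_inputs(inputs, required):
    """Return the names in `required` whose value is missing or blank."""
    return [name for name in required if not (inputs.get(name) or "").strip()]


# podvm_image is empty, as in the failing job; caa_image is a made-up example.
inputs = {"podvm_image": "", "caa_image": "example.invalid/caa:dev"}
print(empty_inputs(inputs, ["podvm_image", "caa_image"]))  # prints ['podvm_image']
```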

@wainersm
Member

I'm afraid I won't be able to fully test on my fork. As the following screenshot shows, the outputs of the mkosi workflows aren't passed along; it's likely a security measure to avoid leaking secrets:

Screenshot from 2025-01-20 18-58-14

@stevenhorsman
Member Author

@stevenhorsman stevenhorsman force-pushed the docker-provider-e2e-test branch from 9880a8e to 1f3bfe8 Compare January 22, 2025 15:03
Add a callable workflow that runs the e2e tests for the docker provider. This
workflow is similar to e2e_libvirt.yaml.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Signed-off-by: stevenhorsman <[email protected]>

This will make the e2e tests for docker run.

Note that continue-on-error is set so that the e2e_run_all workflow
exit status won't change, i.e. any failure on e2e_docker is disregarded.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
@stevenhorsman stevenhorsman force-pushed the docker-provider-e2e-test branch from 1f3bfe8 to 76ebd31 Compare January 22, 2025 15:09
@stevenhorsman stevenhorsman force-pushed the docker-provider-e2e-test branch from 76ebd31 to 25e3bee Compare January 22, 2025 15:48
stevenhorsman and others added 2 commits January 22, 2025 17:17
As discussed in confidential-containers#2171 the CAA_IMAGE envs are not working in the e2e code
and, combined with the installation directory, it seems to add confusion
when we need different CAA images for decoupling of different architectures,
so switch docker to use the same approach as libvirt for consistency.

Signed-off-by: stevenhorsman <[email protected]>

The e2e tests for docker on Kind and containerd have failed due to the
nydus-snapshotter bug, the well-known problem of image layers not being
found on the host. That issue doesn't affect CRI-O, so let's switch to that
container runtime for running the tests.

Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Signed-off-by: stevenhorsman <[email protected]>
@stevenhorsman stevenhorsman force-pushed the docker-provider-e2e-test branch from 25e3bee to 483a246 Compare January 22, 2025 17:17
@stevenhorsman
Member Author

I've run the latest version of this in my fork and the docker tests are passing: https://github.com/stevenhorsman/cloud-api-adaptor/actions/runs/12911903250/job/36006077570, so I think this is ready for review

@stevenhorsman stevenhorsman marked this pull request as ready for review January 22, 2025 17:18
@stevenhorsman stevenhorsman requested a review from a team as a code owner January 22, 2025 17:18