-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Docker provider e2e test #2188
base: main
Are you sure you want to change the base?
Docker provider e2e test #2188
Conversation
Thanks @stevenhorsman ! I ran on my fork but it failed to install caa: https://github.com/wainersm/cc-cloud-api-adaptor/actions/runs/12180636082/job/33975999946 I will add a step to gather logs at the end of the job on fail, as we do for libvirt. |
@stevenhorsman pushed two new commits to enable debug on docker workflow. On https://github.com/wainersm/cc-cloud-api-adaptor/actions/runs/12185682757/job/33993001651, it fail to start caa:
The caa image being used has no support for docker... is the support built in by default? |
So we need to use the dev image to get libvirt support due to libvirt using cgo bindings. |
I think there is also an issue with |
@wainersm I've made and pushed the change that should fix the CAA issue and testing it now. I think it can be fixup'd into a previous commit, but as it might be controversial I wanted to give you the chance to see it first |
Hi @stevenhorsman !
I accepted the pre_install stuff makes no sense anymore. Yep, let's remove it completely when we finish the removal of packer workflows. Ah, I think we should leave your last commit as is, the commit message has a nice explanation for the history of this git repository. Did the test finish? Worked out? |
The test failed: https://github.com/stevenhorsman/cloud-api-adaptor/actions/runs/12196066119/job/34023415633 We only seem to get the last ~10 lines of the CAA log which isn't super helpful in this case. I think it might have to be a manual debug, though I tried that a few weeks ago when I first wrote some of this code and didn't get very far. I'm not sure if we still get this problem: https://github.com/confidential-containers/cloud-api-adaptor/tree/main/src/cloud-api-adaptor/docker#troubleshooting that used to always happy on the docker provider? |
Yeah, I will reserve some time to debug locally. In any case I sent an fixup to print all caa log messages. For the records, I never made the docker provider to work locally (nor on my tentative to introduce the CI workflow) :( |
Yeah, I haven't had it working for a long time. I think Pradipta might have though. |
I was able to reproduce it locally. Looking at a dangling podvm container, the
The only files mounted in
|
The daemon.json is created by process_user_data - https://github.com/confidential-containers/cloud-api-adaptor/blob/main/src/cloud-providers/docker/provider.go#L78 |
This may be related if you were using latest code - #2222 |
Hi @bpradipt !
Yep, I'm using the latest podvm and caa images, Thanks for this fix! The symptoms matches with what I've seen in my environment, where
I will give the fix a try. |
Last week I applied #2222 on top of this PR to check the fix, but I realized the built caa image had the old (from main branch) code. I will find some time to audit this workflow again to find the reason. |
919bdda
to
98fc450
Compare
Rebased the branch with main to pick up the latest fix to docker: #2222 |
That's a good idea @bpradipt ! I only need to remember how to switch to crio on Kind. ;) |
;) .. Let me see if I can help.. |
Create a common debugger script (./hack/ci-e2e-debug-fail.sh) that should be called by workflows in case of failure, to help on debugging activities. By switching to a common script we avoid the problem of testing on pull_request_target triggered workflows. Also reduce the amount of duplicated code. Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
98fc450
to
9880a8e
Compare
@bpradipt @stevenhorsman let me give a status on this. Updated to use CRI-O, applied some fixups, and rebased. I ran in my fork, but the job failed due a behavior I never hit before: The The job in question: https://github.com/wainersm/cc-cloud-api-adaptor/actions/runs/12873694770/job/35892242906 |
The output of my fork run: https://github.com/stevenhorsman/cloud-api-adaptor/actions/runs/12909361757/job/35997414640 |
9880a8e
to
1f3bfe8
Compare
Add a callable workflow that run the e2e tests for the docker provider. This workflow is similar to e2e_libvirt.yaml. Signed-off-by: Wainer dos Santos Moschetta <[email protected]> Signed-off-by: stevenhorsman <[email protected]>
This will make the e2e tests for docker to run. Notice that's set continue-on-error so that the e2e_run_all workflow exit status won't change, i.e. any failure on e2e_docker is disregarded. Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
1f3bfe8
to
76ebd31
Compare
76ebd31
to
25e3bee
Compare
As discussed in confidential-containers#2171 the CAA_IMAGE envs are not working in the e2e code and combined with the installation directory, it seems seems to add confusion when we need different CAA images for decoupling of different architectures, so switch docker to use the same approach as libvirt for consistency. Signed-off-by: stevenhorsman <[email protected]>
The e2e tests for docker on Kind and Containerd has failed due the bug of nydus-snapshotter, the well-known problem of image layers not being found at host. That issues doesn't affect CRI-O, so let's switch to that container runtime for running the tests on. Signed-off-by: Wainer dos Santos Moschetta <[email protected]> Signed-off-by: stevenhorsman <[email protected]>
25e3bee
to
483a246
Compare
I've run the latest version of this in my fork and the docker tests are passing: https://github.com/stevenhorsman/cloud-api-adaptor/actions/runs/12911903250/job/36006077570, so I think this is ready for review |
This is based on Wainer's PR #1965, with some work done to make it fit into the new e2e world