Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kind provider #1232

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
Open

Conversation

aerosouund
Copy link
Member

What this PR does / why we need it:

Adds capabilities to run kind clusters in the gocli

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

This PR is opened as a draft because its based on the Opts PR which may receive changes and has not been merged yet. it also doesn't include code that integrates this capability with cluster-up. as this is yet left for discussion

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Adds capabilities to run kind clusters in the gocli

@kubevirt-bot kubevirt-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 28, 2024
@kubevirt-bot
Copy link
Contributor

Hi @aerosouund. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kubevirt-bot kubevirt-bot requested review from dhiller and oshoval July 28, 2024 10:43
@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign rmohr for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

The ContainerClient is an abstraction to facilitate using a container runtime on the host regardless of what it is.
The CreateOpts type is a holder of the container run options required in kubevirt ci.

Signed-off-by: aerosouund <[email protected]>
…nd extend it

The docker adapter was previously used to execute commands when ssh.sh was still used.
Its removed and the adapter is changed to be a general interface for container interaction.
Implement the full ssh client interface on the adapter.

Signed-off-by: aerosouund <[email protected]>
The implementation is a wrapper around the docker bash commands to avoid compatiblity issues across different runtime libraries.

Signed-off-by: aerosouund <[email protected]>
the PodmanSSHClient is the podman equivalent of the DockerAdapter type for dealing with podman containers through the ssh client interface

Signed-off-by: aerosouund <[email protected]>
@aerosouund aerosouund force-pushed the gocli-kind-provider branch from 4bbafff to 2c07073 Compare August 24, 2024 12:18
@kubevirt-bot kubevirt-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 24, 2024
@aerosouund aerosouund marked this pull request as ready for review August 24, 2024 12:19
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 24, 2024
@aerosouund
Copy link
Member Author

Provides the necessary functionality for running multus on sr-iov clusters

Signed-off-by: aerosouund <[email protected]>
Provide functionality to run core k8s constructs on sr-iov clusters

Signed-off-by: aerosouund <[email protected]>
Remounting sysfs as readwrite.
K8s networking configuration.
Downloading and executing registry proxy scripts.

Signed-off-by: aerosouund <[email protected]>
The base provider is a valid provider that can be run standalone for a stock kind cluster, And is also the base for
any other type of kind cluster.
It is container runtime agnostic and depends on the sigs.k8s.io/kind/pkg/cluster package which is introduced in this commit as a dependency.
It uses this package to provision a kind cluster and provides extra functionality by running a registry container and configuring the nodes
to reach it.

Signed-off-by: aerosouund <[email protected]>
The SR-IOV leverages the base provider for core functionality then installs sr-iov capability in the containers afterwards

Signed-off-by: aerosouund <[email protected]>
The vgpu provider leverages the base provider for its core, then checks if the host has any available vgpus.

Signed-off-by: aerosouund <[email protected]>
The new command will parse flags into a KindConfig and run the cluster with it.

Signed-off-by: aerosouund <[email protected]>
Delete old bash code and up clusters using the gocli instead.

Signed-off-by: aerosouund <[email protected]>
@aerosouund aerosouund force-pushed the gocli-kind-provider branch from 2c07073 to 3ccedec Compare August 25, 2024 09:48
Copy link
Member

@EdDev EdDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An impressive work has been done here.

The SR-IOV lane is effected by this change.
Unfortunately, the sig-network which have been maintaining in practice this area has no resources to invest into kubevirtci at this time.

Therefore, I would suggest to avoid modifying the kind and SR-IOV portions so we can continue supporting it.

Said that, if someone else will take over and provide the support and maintain the new changes, I''ll be be good with it.

@aerosouund
Copy link
Member Author

@EdDev
Thanks a lot and that's understandable due to the size of the change.

I would like to provide support for this code myself and do have plans for long term maintenance of it under supervision of one of the kubevirt CI approvers

@oshoval
Copy link
Contributor

oshoval commented Sep 11, 2024

This will also create dependencies between vm based providers and kind providers (via gocli)
eliminating benefits of provision manager, and the fact there will be dependency between the providers
makes it impossible to advance with one while the other have problems as happened sometimes
(and provision manager allowed solving it, hence saving developers time + saving CI time)

@aerosouund
Copy link
Member Author

@oshoval
I see your concerns, however I personally don't see them overlapping in terms of dependencies
while yes, they both will be a part of the gocli, the only thing they share in terms of the dependencies is the k8s package and the ssh package, which were designed to be agnostic to what type of cluster you are using.

makes it impossible to advance with one while the other have problems as happened sometimes

very reasonable concern, can you give me an example of when this might arise ?

eliminating benefits of provision manager

I haven't looked at the provision manager a lot as it didn't seem to be affected by what I am doing, but I'd appreciate you highlighting how it relates to what you are describing. what does it add and how does a change like that cancel its benefits ?

@aerosouund
Copy link
Member Author

@oshoval

In general I think this code will still need to go through some reviews before a serious discussion is had about whether or no to actually use it
If you have the time I would highly appreciate if you check it out and see its inner workings and if you see ways of how to make it better

@oshoval
Copy link
Contributor

oshoval commented Sep 11, 2024

@oshoval

In general I think this code will still need to go through some reviews before a serious discussion is had about whether or no to actually use it If you have the time I would highly appreciate if you check it out and see its inner workings and if you see ways of how to make it better

Sorry, i wouldn't able to, i am committed to some urgent tasks

@oshoval
Copy link
Contributor

oshoval commented Sep 11, 2024

@oshoval I see your concerns, however I personally don't see them overlapping in terms of dependencies while yes, they both will be a part of the gocli, the only thing they share in terms of the dependencies is the k8s package and the ssh package, which were designed to be agnostic to what type of cluster you are using.

makes it impossible to advance with one while the other have problems as happened sometimes

very reasonable concern, can you give me an example of when this might arise ?

We had it not once, when one vm based version was broken, and it affected all the other vm based versions / kind
because they either merged or not together
provision manager improved this by saying what files affect which provider, allowing PRs to test just the required changes, and also not provision where not needed.
As even the provision itself consume resources, bump packages and can introduce bugs.

eliminating benefits of provision manager

I haven't looked at the provision manager a lot as it didn't seem to be affected by what I am doing, but I'd appreciate you highlighting how it relates to what you are describing. what does it add and how does a change like that cancel its benefits ?

once gocli affect kind providers, and becomes a bigger library that includes most of the core code,
provision manager wont able to declare that gocli doesnt affect anymore kind based providers.

@aerosouund
Copy link
Member Author

aerosouund commented Sep 11, 2024

once gocli affect kind providers, and becomes a bigger library that includes most of the core code,
provision manager wont able to declare that gocli doesn't affect anymore kind based providers.

After checking out the provision-manager command i have a take on this
to rephrase what you are saying, I see that the purpose of pman is to determine if a change to the repo requires the VM providers to be built again
and it's currently configured to rebuild all providers on any change to the gocli. so by making kind in the gocli we introduce the problem of having to rebuild VMs even if we are changing kind logic

understandable, and this problem can be tackled by having more granular rules

The logic is not entirely shared. As i say there's already a limited amount of shared code between the two in the gocli. so the pman rules file can be changed to exclude directories that are known to be pure kind logic
the code structure also can be altered to have an even higher degree of separation and make management structure much more clear

While this sounds like more complexity on the surface but the benefit accrued from having more navigable, testable and extensible kind code is worth considering

I also see this to be lesser complexity than the current state of kind. in terms of it being a series of source somescript.sh calls

@kubevirt-bot kubevirt-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 1, 2024
@kubevirt-bot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kubevirt-bot
Copy link
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@kubevirt-bot kubevirt-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants