guest/net: New implementation of network setup with SLAAC and own DHC… #111
Supersedes #64. Test passt static builds, along with RPMs and Debian packages (x86 only, sorry) at https://passt.top/builds/latest/x86_64/ |
ca7ea9c to e1eb9df
Hmm, I can't test this anymore after rebasing to latest upstream. I'm getting one of these two errors ("Failed to create the microVM" about 30% of the time, Failed to execute
With
or:
I reverted a few commits but I can't seem to get this to work anymore. |
Oh, okay, I didn't pull for a while. It works if I do:
...posting another version of that commit with two functions taken out of configure_network() now that I can test things. Those other issues I'm facing, I have no idea where to start debugging them... |
Fixed by #112
Fixed by updating libkrun. |
This is now supported by passt 2024_11_27.c0fbc7e, matching Fedora updates passt-0^20241127.gc0fbc7e-1.fc40, passt-0^20241127.gc0fbc7e-1.fc41, passt-0^20241127.gc0fbc7e-1.fc42, as well as Debian's passt-0.0~git20241127.c0fbc7e-1. |
Thanks a lot @sbrivio-rh, I really like this approach. A couple of questions:
Ah, right, I thought So, yes, I should also configure
Oops, I didn't check. I'm not exactly the right person as I barely understand what a crate is (do I?) and I don't even use Fedora regularly, but yes, I can probably do that, it should be low effort. I found this abandoned Copr by the way, https://copr.fedorainfracloud.org/coprs/zurdo/i3status-rs-update/package/rust-neli/, it looks pretty easy. Let me give it a try, but if you know somebody else who could be interested... |
Naive question: if it's statically linked, does it really become a dependency? Or is it just a build dependency? Does that matter also if the crate is downloaded as needed...? |
It's just a build dependency. In Fedora, every crate you depend on must be independently packaged, and builds are done offline. Luckily, rust2rpm helps a lot with this. |
Oh, oops, I just had a look at https://src.fedoraproject.org/user/slp/projects... let me package that. :) |
I just added support for nameservers over DHCP (option 6), omitting for the moment:
Oops, I just noticed the |
Gosh, the reformatted version looks horrible, with 100 columns that don't fit pretty much anywhere and things wildly misaligned. Is |
Package reviews: |
We had this problem with some of the rust-vmm packages. If the LICENSE file is in the root of the workspace, it doesn't get included in the crate. The trick here is adding the file as a source as done here: https://src.fedoraproject.org/rpms/rust-virtio-queue/blob/rawhide/f/rust-virtio-queue.spec |
That's what I did there (I think?):
...there must be some subtle detail I'm missing. |
Grr, of course. From the rust-virtio-queue spec file:
Thanks @slp for the example. |
Fedora package review request for |
Looks like the merge conflicts have not been resolved... 🙈 |
I just clicked around randomly, I'm anyway not able to test it anymore because of #117 (comment) and a new I'll push fixed conflicts once I can test this again. |
Oh, that's because my usual trick of commenting out On the other hand, if I skip starting it, then everything hangs... |
Try using --sommelier, it might work on a headless machine that way (sorry for close/reopen, literally a misclick) |
...thanks, yes, that also works because it reverts the effect of commit On the other hand I guess we should have the same fallback with no options at all (at least as long as muvm is supposed to be a generic tool). |
Looks like the package review got stuck because it depends on packaging |
Thanks for the offer! But actually, this is stuck because I'm waiting for jbaublitz/neli#241. It's not blocking, but I thought it makes sense to wait a bit anyway as the author is quite close to merge it. Once it's merged, I would then spend some "hours of fun" on packaging matters (not just that package) and try to sort all those. List of things (kind of) blocking this at the moment:
There is now an "official" way of getting the root shell, by running something like
Ah, thanks, I didn't know that. For some reason it hangs for a while, then aborts for me:
with the guest started as On the other hand, it's much less usable for me than something like |
Now superseded by jbaublitz/neli#256 (merged), so I can proceed with everything. I'll continue as soon as I find a couple of hours in a row. |
…P client

The existing implementation has a couple of issues:

- it doesn't support IPv6 or SLAAC
- it relies on either dhclient(8) or dhcpcd(8), which need a significant amount of time to configure the network as they are rather generic DHCP clients
- on top of this, dhcpcd, by default, unless --noarp is given, will spend five seconds ARP-probing the address it just received before configuring it

Replace the IPv4 part with a minimalistic, 90-line DHCP client that just does what we need, using option 80 (Rapid Commit) to speed up the whole exchange.

Add IPv6 support (including IPv4-only, and IPv6-only modes) relying on the kernel to perform SLAAC. Safely avoid DAD (we're the only node on the link) by disabling router solicitations, starting SLAAC, and re-enabling them once addresses are configured.

Instead of merely triggering the network setup and proceeding, wait until everything is configured, so that connectivity is guaranteed to be ready before any further process runs in the guest, say:

  $ ./target/debug/muvm -- ping -c1 2a01:4f8:222:904::2
  PING 2a01:4f8:222:904::2 (2a01:4f8:222:904::2) 56 data bytes
  64 bytes from 2a01:4f8:222:904::2: icmp_seq=1 ttl=255 time=0.256 ms
  --- 2a01:4f8:222:904::2 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.256/0.256/0.256/0.000 ms

The whole procedure now takes approximately 1.5 to 2 ms (for both IPv4 and IPv6), with the DHCP exchange and configuration taking somewhere around 300-500 µs out of that, instead of hundreds of milliseconds to seconds.

Configure nameservers received via DHCP option 6 as well: passt already takes care of translating DNS traffic directed to loopback addresses read from resolv.conf, so we can just write those to resolv.conf in the guest.

At least for the time being, for simplicity, omit handling of option 119 (domain search list), as I doubt it's going to be of much use for muvm.

I'm not adding handling of the NDP RDNSS option (25, RFC 8106) either, for the moment, as it involves a second netlink socket subscribing to the RTNLGRP_ND_USEROPT group and listening to events while we receive the first router advertisement. The equivalent userspace tool would be rdnssd(8), which is not called before this change anyway. I would rather add it at a later time instead of making this patch explode.

Matching support in passt for option 80 (RFC 4039) and for the DHCP "broadcast" flag (RFC 2131) needs at least passt 2024_11_27.c0fbc7e:

https://archives.passt.top/passt-user/20241127142126.3c53066e@elisabeth/

Signed-off-by: Stefano Brivio <[email protected]>
Co-authored-by: Teoh Han Hui <[email protected]>
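For readers unfamiliar with the wire format, here is a minimal sketch of what a DHCPDISCOVER carrying option 80 (Rapid Commit) and the "broadcast" flag could look like. It is not the code from this pull request: the function name `build_discover`, its parameters, and the choice of requested options (1, 3, 6, 26) are assumptions made for illustration. Only the fixed BOOTP layout and option codes come from RFC 2131, RFC 2132 and RFC 4039.

```rust
/// Build a minimal DHCPDISCOVER message (RFC 2131) that asks the server for a
/// two-message exchange via Rapid Commit (option 80, RFC 4039).
/// Hypothetical helper for illustration, not taken from muvm.
fn build_discover(xid: u32, mac: [u8; 6]) -> Vec<u8> {
    // Fixed BOOTP header (236 bytes) plus the 4-byte magic cookie, zero-filled.
    let mut p = vec![0u8; 240];
    p[0] = 1; // op: BOOTREQUEST
    p[1] = 1; // htype: Ethernet
    p[2] = 6; // hlen: length of an Ethernet address
    p[4..8].copy_from_slice(&xid.to_be_bytes()); // transaction ID
    p[10..12].copy_from_slice(&0x8000u16.to_be_bytes()); // flags: broadcast bit
    p[28..34].copy_from_slice(&mac); // chaddr: client hardware address
    p[236..240].copy_from_slice(&[99, 130, 83, 99]); // magic cookie

    // Options are encoded as code, length, value.
    p.extend_from_slice(&[53, 1, 1]); // message type: DHCPDISCOVER
    p.extend_from_slice(&[80, 0]); // Rapid Commit, zero-length value
    p.extend_from_slice(&[55, 4, 1, 3, 6, 26]); // request: mask, router, DNS, MTU
    p.push(255); // end of options
    p
}
```

With Rapid Commit, a server that supports it can answer the DISCOVER directly with a DHCPACK, which halves the number of round trips compared to the usual DISCOVER, OFFER, REQUEST, ACK sequence.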
Now it should be all clean again. I also added handling of option 26 (Interface MTU), as passt can conveniently use 65520 bytes (the host kernel does segmentation for TCP anyway). That's one small step for a DHCP client, one giant leap for throughput. I can't fathom for the life of me how you all can live without version logs for patches. |
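As an illustration of how options 6 (nameservers) and 26 (Interface MTU) can be pulled out of a DHCP reply, here is a rough sketch. It is not the parser from this pull request: `Lease`, `parse_options` and `resolv_conf` are hypothetical names, and the input is assumed to be the options field only, that is, the bytes after the magic cookie.

```rust
use std::net::Ipv4Addr;

/// The subset of a lease this sketch cares about (hypothetical type).
#[derive(Debug, Default, PartialEq)]
struct Lease {
    dns: Vec<Ipv4Addr>,
    mtu: Option<u16>,
}

/// Walk the code/length/value options that follow the DHCP magic cookie.
fn parse_options(mut opts: &[u8]) -> Lease {
    let mut lease = Lease::default();
    while let Some((&code, rest)) = opts.split_first() {
        match code {
            0 => {
                // Pad option: a single byte, no length field.
                opts = rest;
                continue;
            }
            255 => break, // End option
            _ => {}
        }
        let Some((&len, rest)) = rest.split_first() else { break };
        let len = len as usize;
        if rest.len() < len {
            break; // truncated option, stop parsing
        }
        let (val, rest) = rest.split_at(len);
        match code {
            // Option 6: one or more 4-byte nameserver addresses.
            6 => lease.dns.extend(
                val.chunks_exact(4)
                    .map(|a| Ipv4Addr::new(a[0], a[1], a[2], a[3])),
            ),
            // Option 26: interface MTU as a 16-bit big-endian integer.
            26 if len == 2 => lease.mtu = Some(u16::from_be_bytes([val[0], val[1]])),
            _ => {} // anything else is ignored in this sketch
        }
        opts = rest;
    }
    lease
}

/// Render nameservers in resolv.conf syntax, one per line.
fn resolv_conf(dns: &[Ipv4Addr]) -> String {
    dns.iter().map(|a| format!("nameserver {a}\n")).collect()
}
```

An MTU of 65520 is encoded as the two bytes 0xff 0xf0, so it still fits the 16-bit value that option 26 carries.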
I sent a new version for (package) review. Now we have an additional problem though:
The updated version, with jbaublitz/neli#182, needs |
Additional review request: |
@sbrivio-rh It's specified by Fedora's Rust packaging guidelines: https://docs.fedoraproject.org/en-US/packaging-guidelines/Rust/#_packaging_multiple_versions |
Sure, that makes sense, I just think that having a dependency manager and in general an ecosystem happily pulling in whatever new crate du jour (that's just Rust and it's all fine per se) along with the requirement that each and every dependency crate is packaged separately (and that's Fedora's policy) poses a significant barrier for contributors (especially occasional ones), and not in a good way, without technical merit or quality benefit whatsoever. Cargo already enforces what's needed. This thing was pretty much ready three months ago. But okay, sure, also
…P client
The existing implementation has a couple of issues:
it doesn't support IPv6 or SLAAC
it relies on either dhclient(8) or dhcpcd(8), which need a significant amount of time to configure the network as they are rather generic DHCP clients
on top of this, dhcpcd, by default, unless --noarp is given, will spend five seconds ARP-probing the address it just received before configuring it
Replace the IPv4 part with a minimalistic, 73-line DHCP client that just does what we need, using option 80 (Rapid Commit) to speed up the whole exchange.
Add IPv6 support (including IPv4-only, and IPv6-only modes) relying on the kernel to perform SLAAC. Safely avoid DAD (we're the only node on the link) by disabling router solicitations, starting SLAAC, and re-enabling them once addresses are configured.
Instead of merely triggering the network setup and proceeding, wait until everything is configured, so that connectivity is guaranteed to be ready before any further process runs in the guest, say:
$ ./target/debug/muvm -- ping -c1 2a01:4f8:222:904::2
PING 2a01:4f8:222:904::2 (2a01:4f8:222:904::2) 56 data bytes
64 bytes from 2a01:4f8:222:904::2: icmp_seq=1 ttl=255 time=0.256 ms
--- 2a01:4f8:222:904::2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.256/0.256/0.256/0.000 ms
The whole procedure now takes approximately 1.5 to 2 ms (for both IPv4 and IPv6), with the DHCP exchange and configuration taking somewhere around 300-500 µs out of that, instead of hundreds of milliseconds to seconds.
Matching support in passt for option 80 (RFC 4039) and for the DHCP "broadcast" flag (RFC 2131) needs this series:
https://archives.passt.top/passt-dev/[email protected]/
[I'll update this commit message once we have an upstream release with it]