-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: block incompatible kernel from being installed #16139
Conversation
From my recollection of the last time this was discussed, since the kernel packages are considered special/required, this just results in dnf resolving the conflict by uninstalling the ZFS packages. |
Interesting, I must have missed that discussion. I tried to search and read-up on this issue :( |
Is #15188 what you are referring to? I think it is useful to spell-out what is the expected way to consume zfs-dkms on ""rolling"" distros where kernel upgrades to incompatible versions on a routine bases. Below is what I gathered reading various issues. The current situation:
The ideal in my opinion:
My understanding is with conflict and setting zfs as protected, it achieves this goal with the caveat that dnf will print a warning when upgrading. is that correct? |
My (personal, not any kind of larger group or org) general advice is not to use it on a rolling distro that bumps kernel revs, really, precisely because it's going to break like this. I don't actually know if that was the discussion, it's come up a few times, but my understanding is that there's not really a good way to express what's wanted in I guess you might be able to do something weird like a postscript or conf file that adds a hold or analogue on the kernel range that's acceptable, maybe? |
I'm fairly sure protected is the way to handle it, here's from dnf docs:
I do agree with the points made in the other issue thread though. I don't think setting zfs protected should be done automatically by the package install script if only for the reason that it blocks user uninstall also. So I think a sentence should be added to the fedora install guide to tell user why they need to set this. |
I don't really agree with setting it either, though, is the thing, I think. That's just going to provoke lots of angry messages about how it broke their upgrade process entirely, and half of them followed by they undid it and it broke, IMO. That's not really a better experience, because it's more frustration, and still broken at the end with what they're likely to do without looking up what they should do. Much as I like to encourage people to read documentation, the reality is that people aren't going to go check it if something breaks, they're going to do what seems to make sense to them to get past this complaint, and that's going to break them the same way as before, but with even higher frustration b/c they've spent more time on it. |
I'm not sure that is the case? Currently the situation is broken for both people that follow instructions and people that do not, and with this change it makes it so people that follow the instructions can have a better experience and people that don't is not worse off anyway. right? |
My argument is that people who follow instructions, it will still be broken the same way, because the ideal outcome is that it upgrades everything else possible but stops the kernel from upgrading past the allowed version, right? If you mark the ZFS packages as protected, if I understand the behavior right, it'll refuse to upgrade if there would be a conflict that would require removing one of (kernel|zfs), and if they unprotect it, it'll break just like if they had never done that dance, yes? (To say nothing of if you unprotect the kernel packages, which I'm not sure it'll even allow you to do...) And if you don't, it'll break the same way it currently does. So my claim is based on the idea that "break all upgrades" is not a net gain, because people are just going to end up at the same end state, plus some frustration to undo the thing that stopped them doing the upgrade in the first place, and there's not really any outcome they can have from that state other than "wait for a newer version to make that not a problem" that's better. If I'm misunderstanding something here, or you disagree with my conclusions, I'd love to hear how I'm wrong or not viewing this the same way - I don't have a particularly strong opinion on this, just that if I understand the situation right, this would just break the "rolling" in "rolling upgrades" entirely if the situation comes up, which I claim is just going to result in more frustration and people still breaking their systems, absent a better solution. |
Do you know how I can download the build artifact from buildbot? It might be valuable to have some practical examples. |
The current "Requires" lines only ensure the old kernel is available on the system but it does not prevent fedora from updating to an incompatible and breaking user's system. Set Conflicts to block incompatible kernels from being installed. Signed-off-by: tleydxdy <[email protected]>
rebasing this since my system is broken again by autoupdate, I really don't see the argument against setting the "conflict" field. If the user does not protect zfs:
If the user does set zfs to protected:
In every combination, the conflict setting results in a safer and more up-to-date result. Also the openzfs docs is quite misleading by suggesting that protect zfs will block kernel upgrade, this is not true, it will only block the old kernel to get rotated out |
I tend to agree, adding "conflicts" seems to me like an improvement across the board for all the scenarios. In the unprotected w/conflicts case having the zfs packages be removed may be confusing, but it does have the advantage that I'm good with making this change. We'll just want to keep an eye out for any new corner cases or unexpected side effects. |
I use zfs on root on Fedora. Ive found the best way to deal with this situation is to just continue to boot the last zfs-supported kernel until the next zfs version is released. This gives the situation where zfs keeps working bug Note that dnf will never, under any circumstance, attempt to uninstall the currently running kernel. So, for example, right now I am booting a 6.10.x kernel, but have 2 6.11.x kernels installed. When dnf installs a kernel update it will uninstall the older of the two 6.11.x. It wont try to uninstall the 6.10.x kernel since it is running, and it wont try to uninstall zfs since there is still a compatible (6.10.x) kernel installed. I feel like this is, by far, the best way to deal with the situation - nothing breaks, nothing uninstalled, and you can continue to use Perhaps an alternate solution would be to add a check in the makefile for the kernel module so that dkms builds fail on unsupported kernels and give a obvious "impossible to miss" error message that instructs them to boot ______ kernel to be able to use zfs? This prevents accidental unknowing corruption from using an unsupported kernel (zfs is useless without its kmod), gives them a good path to "wait it out" on an older kernel, and if they choose to not take that path (and not have zfs at all) for 3 kernel updates in a row then having dnf auto-uninstall zfs seems reasonable. |
There's a third "gray zone" case that actually comes up fairly often on Fedora. It happens when we do a release supporting the X kernel, but it just so happens that we've pulled in enough patches into the release to also build under the X+1 kernel as well. For example, zfs-2.2.6 supports the 6.10 kernel, but I believe it will also build on 6.11. It's just that we didn't explicitly test it on the 6.11 kernel at the time of release. I think this is actually the status quo with zfs-2.2.6 Fedora 40 right now. It buys us wiggle room when the OS is aggressively updating it's kernel and we don't want ZFS to artifically short-circuit the build, only because the kernel is untested with ZFS. There has only been one time that I can think of where ZFS built correctly for a newer "unsupported" kernel, but didn't work correctly with it. We would lose this "grey zone" behavior if we pulled in this PR. We would spend more time updating the META file and putting out new releases, even if the code itself hasn't changed. So it's a tradeoff - you either keep the "gray zone" behavior and have longer-lived releases on Fedora (which will make users less annoyed), or do the correct thing and pull in this PR to uninstall ZFS on newer, "unsupported" kernels which ZFS would most likely work perfectly fine on. |
I can confirm that on Fedora 40 the zfs kmod for 6.11 builds wiithout error and seemingly works properly (though I havent tested it beyond being able to boot into my system, which uses zfs on root). Although things seem to work ok on kernel 6.11 with zfs 2.6, I noticed that during boot I get a few warnings + call trace printed to screen that are related to zfs. I almost always get these whenever Im using a kernel that is 1 step higher than the max zfs supports (i.e., this isnt unique to zfs 2.6 + kernel 6.11). I'll add the warnings / call traces I see with 6.11 + zfs 2.6 in a file attached to this comment (to avoid needlessly adding 240 lines to this conversation). Im not sure if this indicates an actual problem (perhaps a very minor one), or if this is just a superfluous warning that can be ignored. It makes me nervous enough though to boot into the older still-supported 6.10 kernel while I wait for an update to bring offical support. |
I can see the argument of not having the conflict field for development package/version, since otherwise it would be pretty annoying to test new kernels. but I don't think having the released version support untested kernels is a good idea. |
Just to put it in context, Fedora 40 alone has run the 6.8, 6.9, 6.10, and 6.11 kernel, and it's not EOL yet:
Now imagine |
As things currently stand we're already going to lose this behavior in the 2.3.0 releases. Commit 410287f, which is not in 2.2.6, updates the build system to enforce the maximum kernel version. To retain the existing behavior we would need to add |
maybe you missed the part with protect zfs? the best case is when there's the conflict field and user choose to set zfs as protected |
Ah that's true, I was only thinking of the 2.2.x branch. Looks like with 2.3.x we'll have to be more aggressive with updating META and putting out Fedora releases. I'll approve it - but can you please update your commit message first to make checkstyle happy:
|
Commit comment fixed when merging. |
oh, whoops, I got carried away with work and didn't check github. Sorry about that |
The current "Requires" lines only ensure the old kernel is available on the system but it does not prevent fedora from updating to an incompatible and breaking user's system. Set Conflicts to block incompatible kernels from being installed. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: tleydxdy <[email protected]> Closes openzfs#16139
The current "Requires" lines only ensure the old kernel is available on the system but it does not prevent fedora from updating to an incompatible and breaking user's system. Set Conflicts to block incompatible kernels from being installed. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: tleydxdy <[email protected]> Closes openzfs#16139
The current "Requires" lines only ensure the old kernel is available on the system but it does not prevent fedora from updating to an incompatible and breaking user's system. Set Conflicts to block incompatible kernels from being installed. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: tleydxdy <[email protected]> Closes openzfs#16139
Motivation and Context
The current "Requires" lines only ensure the old kernel is available on the system but it does not prevent fedora from updating to an incompatible and breaking user's system.
Description
Set Conflicts to block incompatible kernels from being installed.
How Has This Been Tested?
Waiting for build test
Types of changes
Checklist:
Signed-off-by
.