Move to balenaOS v3+? #60
Hey @shaunco, do you want to give this branch a try that is using balenaOS v3.2.9? You can also change the device type from We are aware of the cgroup v2 limitations, but there are a number of obstacles to overcome:
If you've had any success with this, feel free to open a PR and we would be happy to merge the changes! |
Running into some errors. Still digging, but figured I'd give an update... On a W11 machine running the latest WSL2 kernel (and Docker for WSL):
Docker Desktop v24.0.5, build ced0996
Note that systemd is enabled via these instructions, and cgroup2+1 is enabled via these instructions.
From W11:
PS > docker info
...
Cgroup Driver: cgroupfs
Cgroup Version: 1
...
From balenaOS:
sh-5.1# cat /proc/filesystems
...
nodev cgroup
nodev cgroup2
I get a lot of errors on the 3+ OS. Here is the log
cgroup mounts seem to have v1 and v2
The contents of `/sys/fs/cgroup`
The output from `systemd-cgls` on the host machine
Very excited for balena-os/balena-engine#439 |
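A quick way to see which cgroup layout a host exposes is to look at the filesystem type mounted at `/sys/fs/cgroup`. The sketch below is only an illustration: the function name `cgroup_mode_from_fstype` is made up here, and it relies on the common convention that `stat -fc %T` reports `cgroup2fs` on a pure v2 (unified) host and `tmpfs` on a v1 or hybrid host.

```shell
#!/bin/sh
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup to a cgroup mode.
cgroup_mode_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2 (unified)" ;;
        tmpfs)     echo "v1 or hybrid" ;;
        *)         echo "unknown" ;;
    esac
}

# stat may fail inside minimal environments, so fall back to an empty string.
fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || true)
echo "cgroup mode: $(cgroup_mode_from_fstype "$fstype")"
```

Running this on both the Docker host and inside the balenaOS container should show whether the two agree on the cgroup layout.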
So, it would appear I'm running into the note at https://www.man7.org/linux/man-pages/man7/kernel-command-line.7.html#HISTORY :
as the base Linux image used by Docker Desktop in WSL2 is Alpine 3.18.2, which means setting Looks like I might have to wait for balena-os/balena-engine#439 to be merged to test this on my W11/WSL2 machine. I will set up a proper Ubuntu 18.04 VM tomorrow with Docker and give this a shot in there. |
On a clean VM of Ubuntu 18.04 with just the latest Docker installed, the base balenaOS seems to start up all services except balena-supervisor, which is repeatedly failing with the following:
This matches the error of balena-os/balena-engine#435 and balena-os/balena-engine#436 ... but I don't see anything left behind in the paths listed in balena-os/balena-engine#435 (comment) The contents of
|
@klutchell - with the runtime v2 change merged into balena-engine, I'll wait for balena-os/meta-balena#3252 to be merged and then for balena-renovate to push that change into https://github.com/balena-os/balena-generic and then give the new builds a try. Not sure of the timing for that to propagate all the way through... |
Here is the PR waiting for merge to meta-balena. It's currently waiting on passing tests and code reviews. Once merged to meta-balena a PR will be automatically opened on balena-generic, and if the tests all pass it will automerge and deploy to production. If the tests fail for any reason we will need to investigate. |
@klutchell - I got back to trying this, and am using 4.1.5, but not having any luck getting it started and connected to balena. I've included the docker log below... let me know if there is anything that comes to mind I should try or extra info I can provide:
As for the failed services:
sh-5.1# journalctl -u resin-boot.service
Nov 18 08:56:58 ughost systemd[813]: Failed to attach 813 to compat systemd cgroup /docker/9d126278d8c110b9cc8071f41b16e89533585a0716dd4b50051d3c1932b34c80/system.slice/resin-boot.service: No such file or directory
Nov 18 08:56:58 ughost systemd[1]: resin-boot.service: Main process exited, code=exited, status=1/FAILURE
Nov 18 08:56:58 ughost systemd[1]: resin-boot.service: Failed with result 'exit-code'.
Nov 18 08:56:58 ughost systemd[1]: Failed to start Resin boot partition mount service.
sh-5.1# journalctl -u resin-state.service
Nov 18 08:56:58 ughost systemd[816]: Failed to attach 816 to compat systemd cgroup /docker/9d126278d8c110b9cc8071f41b16e89533585a0716dd4b50051d3c1932b34c80/system.slice/resin-state.service: No such file or directory
Nov 18 08:56:58 ughost systemd[1]: resin-state.service: Main process exited, code=exited, status=1/FAILURE
Nov 18 08:56:58 ughost systemd[1]: resin-state.service: Failed with result 'exit-code'.
Nov 18 08:56:58 ughost systemd[1]: Failed to start Resin state partition mount service.
sh-5.1# journalctl -u resin-data.service
Nov 18 08:56:58 ughost systemd[718]: Failed to attach 718 to compat systemd cgroup /docker/9d126278d8c110b9cc8071f41b16e89533585a0716dd4b50051d3c1932b34c80/system.slice/resin-data.service: No such file or directory
Nov 18 08:56:58 ughost systemd[1]: resin-data.service: Main process exited, code=exited, status=1/FAILURE
Nov 18 08:56:58 ughost systemd[1]: resin-data.service: Failed with result 'exit-code'.
Nov 18 08:56:58 ughost systemd[1]: Failed to start Resin data partition mount service. |
Hey @shaunco, does WSL include systemd at all? I'm guessing no since your engine is using cgroupfs? It may be that systemd in balenaOS is looking for systemd cgroups on the host and not finding any since you're in WSL. Maybe try creating some fake cgroup mounts before running the container? Here's something I did recently to get systemd-in-docker to work in a VM rootfs without systemd. |
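For readers following along: the `Failed to attach ... to compat systemd cgroup` errors suggest that systemd inside the container expected a `name=systemd` cgroup v1 hierarchy that was not there. The sketch below is not the snippet linked above. It is a minimal illustration, assuming a cgroup v1 or hybrid host and root access, of checking for that hierarchy and of what creating it might look like. The helper name `has_systemd_cgroup` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a mounts table (default /proc/mounts)
# lists a cgroup v1 hierarchy mounted with the name=systemd option.
has_systemd_cgroup() {
    grep -Eq '^[^ ]+ [^ ]+ cgroup [^ ]*name=systemd' "${1:-/proc/mounts}" 2>/dev/null
}

if has_systemd_cgroup; then
    echo "name=systemd cgroup hierarchy is mounted"
else
    echo "no name=systemd cgroup hierarchy found"
    # Assumption: on a host without systemd, it could be created as root with:
    #   mkdir -p /sys/fs/cgroup/systemd
    #   mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
fi
```

The mount commands are left as comments on purpose, since they change host state and need to be adapted to the environment.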
[UPDATE: THIS WAS NOT ACTUALLY WORKING AS IT SHOULD, SEE MY LATER COMMENT] Hi @shaunco @klutchell I managed to get it working using:
Using the changes Kyle made here: #63 I am using an ubuntu x86 laptop:
|
Haven't gone deeply back in to the code but I wouldn't be surprised if
recent changes to
https://github.com/balena-os/balena-supervisor/commits/master/mount-partitions.sh
have inadvertently fixed this? :)
…On Fri, 24 May 2024, 3:39 pm Ryan Cooke, ***@***.***> wrote:
Hi @shaunco <https://github.com/shaunco> I managed to get it working
using:
- OS_VERSION : 5.1.10
- DEVICE_TYPE: generic-amd64
Using the changes Kyle made here: #63
<#63>
I am using an ubuntu x86 laptop:
***@***.***:~$ grep cgroup /proc/filesystems
nodev cgroup
nodev cgroup2
***@***.***:~$ docker info
Client:
Version: 24.0.5
Context: default
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 140
Server Version: 24.0.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version:
runc version:
init version:
Security Options:
apparmor
seccomp
Profile: builtin
Kernel Version: 5.4.0-182-generic
Operating System: Ubuntu 20.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 12
Total Memory: 15.27GiB
WARNING: No swap limit support
|
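Both reports in this thread hinge on the `Cgroup Driver` and `Cgroup Version` lines of `docker info`. As a small sketch for comparing hosts, the following pulls just those two values out of the text output. The helper name `cgroup_summary` is hypothetical, and the sample input simply mirrors the values quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read `docker info` text on stdin and print
# "<driver> <version>".
cgroup_summary() {
    awk -F': ' '
        /Cgroup Driver:/  { driver = $2 }
        /Cgroup Version:/ { version = $2 }
        END { print driver, version }
    '
}

# Sample input mirroring the output quoted in this thread.
# Real use would be: docker info | cgroup_summary
printf '%s\n' ' Cgroup Driver: cgroupfs' ' Cgroup Version: 1' | cgroup_summary
```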
I tried both Not sure how you managed to get past this point @rcooke-warwick ! |
Update: This was only working for me because I had an older version of the supervisor in an out of date volume - which happened because I:
|
Checking back in on this as I just tried this with
Failed services
resin-boot service logs:
sh-5.1# systemctl status resin-boot.service
× resin-boot.service - Resin boot partition mount service
Loaded: loaded (/lib/systemd/system/resin-boot.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2024-09-09 21:12:49 UTC; 2min 47s ago
Process: 807 ExecStart=/usr/bin/resin-partition-mounter --mount resin-boot (code=exited, status=1/FAILURE)
Main PID: 807 (code=exited, status=1/FAILURE)
CPU: 10ms
Sep 09 21:12:49 e836c93b73ca resin-partition-mounter[817]: od: /sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c: No such file or directory
Sep 09 21:12:49 e836c93b73ca resin-partition-mounter[807]: /usr/libexec/os-helpers-efi: line 32: test: : integer expression expected
Sep 09 21:12:49 e836c93b73ca systemd[1]: resin-boot.service: Main process exited, code=exited, status=1/FAILURE
Sep 09 21:12:49 e836c93b73ca systemd[1]: resin-boot.service: Failed with result 'exit-code'.
Sep 09 21:12:49 e836c93b73ca systemd[1]: Failed to start Resin boot partition mount service.
resin-state service logs:
sh-5.1# systemctl status resin-state.service
× resin-state.service - Resin state partition mount service
Loaded: loaded (/lib/systemd/system/resin-state.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2024-09-09 21:12:50 UTC; 4min 11s ago
Process: 808 ExecStart=/usr/bin/resin-partition-mounter --mount resin-state (code=exited, status=1/FAILURE)
Main PID: 808 (code=exited, status=1/FAILURE)
CPU: 10ms
Sep 09 21:12:49 e836c93b73ca resin-partition-mounter[821]: od: /sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c: No such file or directory
Sep 09 21:12:49 e836c93b73ca resin-partition-mounter[808]: /usr/libexec/os-helpers-efi: line 32: test: : integer expression expected
Sep 09 21:12:50 e836c93b73ca systemd[1]: resin-state.service: Main process exited, code=exited, status=1/FAILURE
Sep 09 21:12:50 e836c93b73ca systemd[1]: resin-state.service: Failed with result 'exit-code'.
Sep 09 21:12:50 e836c93b73ca systemd[1]: Failed to start Resin state partition mount service.
resin-data service logs:
sh-5.1# systemctl status resin-data.service
× resin-data.service - Resin data partition mount service
Loaded: loaded (/lib/systemd/system/resin-data.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2024-09-09 21:12:49 UTC; 7min ago
Process: 728 ExecStart=/usr/bin/resin-partition-mounter --mount resin-data (code=exited, status=1/FAILURE)
Main PID: 728 (code=exited, status=1/FAILURE)
CPU: 12ms
Sep 09 21:12:49 e836c93b73ca resin-partition-mounter[733]: od: /sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c: No such file or directory
Sep 09 21:12:49 e836c93b73ca resin-partition-mounter[728]: /usr/libexec/os-helpers-efi: line 32: test: : integer expression expected
Sep 09 21:12:49 e836c93b73ca systemd[1]: resin-data.service: Main process exited, code=exited, status=1/FAILURE
Sep 09 21:12:49 e836c93b73ca systemd[1]: resin-data.service: Failed with result 'exit-code'.
Sep 09 21:12:49 e836c93b73ca systemd[1]: Failed to start Resin data partition mount service.
mnt-sysroot-active logs:
sh-5.1# systemctl status mnt-sysroot-active.service
× mnt-sysroot-active.service - Resin active root partition mount service
Loaded: loaded (/lib/systemd/system/mnt-sysroot-active.service; static)
Active: failed (Result: exit-code) since Mon 2024-09-09 21:12:49 UTC; 8min ago
Process: 20 ExecStart=/usr/bin/resin-partition-mounter --sysroot --mount active (code=exited, status=1/FAILURE)
Main PID: 20 (code=exited, status=1/FAILURE)
CPU: 368ms
Sep 09 21:12:19 e836c93b73ca resin-partition-mounter[36]: od: /sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c: No such file or directory
Sep 09 21:12:19 e836c93b73ca resin-partition-mounter[20]: /usr/libexec/os-helpers-efi: line 32: test: : integer expression expected
Sep 09 21:12:49 e836c93b73ca resin-partition-mounter[20]: ERROR: Timeout while waiting for /dev/disk/by-state/active to come up.
Sep 09 21:12:49 e836c93b73ca systemd[1]: mnt-sysroot-active.service: Main process exited, code=exited, status=1/FAILURE
Sep 09 21:12:49 e836c93b73ca systemd[1]: mnt-sysroot-active.service: Failed with result 'exit-code'.
Sep 09 21:12:49 e836c93b73ca systemd[1]: Failed to start Resin active root partition mount service.
Notice: journal has been rotated since unit was started, output may be incomplete.
dnsmasq logs:
sh-5.1# systemctl status dnsmasq.service
× dnsmasq.service - DNS forwarder and DHCP server
Loaded: loaded (/lib/systemd/system/dnsmasq.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/dnsmasq.service.d
└─dnsmasq-conf.conf, dnsmasq.conf
Active: failed (Result: exit-code) since Mon 2024-09-09 21:12:50 UTC; 9min ago
Process: 826 ExecStartPre=/usr/bin/dnsmasq --test (code=exited, status=0/SUCCESS)
Process: 827 ExecStartPre=/usr/sbin/gen-conf-unit dnsmasq (code=exited, status=0/SUCCESS)
Process: 835 ExecStart=/usr/bin/dnsmasq -x /run/dnsmasq.pid -a 127.0.0.2,10.114.102.1 -7 /etc/dnsmasq.d/ -r /etc/resolv.dnsmasq -z --servers-file=/run/dnsmasq.servers -k --log-facility=- (code=exited, status=2)
Main PID: 835 (code=exited, status=2)
CPU: 12ms
Sep 09 21:12:50 e836c93b73ca dnsmasq[826]: dnsmasq: syntax check OK.
Sep 09 21:12:50 e836c93b73ca gen-conf-unit[832]: od: /sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c: No such file or directory
Sep 09 21:12:50 e836c93b73ca gen-conf-unit[827]: /usr/libexec/os-helpers-efi: line 32: test: : integer expression expected
Sep 09 21:12:50 e836c93b73ca dnsmasq[835]: dnsmasq: failed to create listening socket for 10.114.102.1: Cannot assign requested address
Sep 09 21:12:50 e836c93b73ca dnsmasq[835]: failed to create listening socket for 10.114.102.1: Cannot assign requested address
Sep 09 21:12:50 e836c93b73ca dnsmasq[835]: FAILED to start up
Sep 09 21:12:50 e836c93b73ca systemd[1]: dnsmasq.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Sep 09 21:12:50 e836c93b73ca systemd[1]: dnsmasq.service: Failed with result 'exit-code'. |
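The `integer expression expected` lines that recur in these logs are what `test` prints when a numeric comparison such as `-eq` receives an empty string, which is consistent with the `od` read of the `SecureBoot` EFI variable failing just before. The actual contents of `os-helpers-efi` are not shown in this thread, so the following is only a sketch of the failure mode and one defensive pattern. The function name `secure_boot_enabled` and the idea of passing the value as an argument are assumptions.

```shell
#!/bin/sh
# Hypothetical guard: treat a missing or non-numeric value as "disabled"
# instead of letting `test` fail with "integer expression expected".
secure_boot_enabled() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a plain integer
    esac
    [ "$1" -eq 1 ]
}

# Simulate the container case, where the efivars file does not exist
# and the value read from it is therefore empty.
value=""
if secure_boot_enabled "$value"; then
    echo "secure boot: enabled"
else
    echo "secure boot: disabled or unavailable"
fi
```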
I had to switch from Looks like
sh-5.1# /usr/bin/resin-partition-mounter --sysroot --mount active
ERROR: Timeout while waiting for /dev/disk/by-state/active to come up.
Given boot/data/state mounts are bind mounted via docker, it would seem this script could just do:
.
.
.
is_docker() {
    # Marker files created by Docker (/.dockerenv) and Podman (/run/.containerenv)
    if [ -f /.dockerenv ] || [ -f /run/.containerenv ]; then
        return 0
    fi
    if grep -q 'docker' /proc/self/cgroup 2>/dev/null; then
        return 0
    fi
    return 1
}
# Inside Docker with --sysroot: bind mount / as the active sysroot
if [ "$sysroot" = "yes" ] && is_docker; then
    echo "INFO: Detected Docker environment with --sysroot flag. Mounting / to /mnt/sysroot/active."
    # mkdir -p creates /mnt/sysroot as well and is a no-op if both already exist
    mkdir -p /mnt/sysroot/active
    mount --bind / /mnt/sysroot/active || ln -s / /mnt/sysroot/active
    exit 0
fi
# Inside Docker without --sysroot: boot/data/state are already bind mounted, so there is nothing to do
if is_docker; then
    exit 0
fi
# shellcheck disable=SC1091
.
.
.
... doing that got things mostly going; however, the supervisor container is giving:
and
|
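One caveat with the `/proc/self/cgroup` check in the proposal above: on cgroup v2 hosts that file often contains only `0::/` inside a container, so the marker-file checks end up doing the work. The sketch below is a hypothetical, testable variant of `is_docker` (named `is_container` here) that takes the root directory and cgroup file as optional arguments. It is an illustration only, not part of `resin-partition-mounter`.

```shell
#!/bin/sh
# Hypothetical variant of is_docker with injectable paths so it can be
# exercised against fixtures as well as the live system.
is_container() {
    root_dir="${1:-/}"
    cgroup_file="${2:-/proc/self/cgroup}"
    # Marker files written by Docker and Podman respectively.
    [ -f "${root_dir%/}/.dockerenv" ] && return 0
    [ -f "${root_dir%/}/run/.containerenv" ] && return 0
    # Fallback for cgroup v1 hosts, where the cgroup path includes "docker".
    grep -q 'docker' "$cgroup_file" 2>/dev/null
}

if is_container; then
    echo "running inside a container"
else
    echo "not running inside a container (or no marker found)"
fi
```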
This project currently pulls down https://hub.docker.com/r/resin/resinos version 2.95.12+rev1. The last x86 resinOS published to Docker Hub was 2.99.27_rev2-genericx86-64-ext - over a year ago.
resinOS was renamed to balenaOS almost 5 years ago.
Since there are no balenaOS containers published to Docker Hub, the balenalib -run containers seem like they might provide the correct path forward into balenaOS v3+ ... but there are 29 to choose from and I need to get to the bottom of how to get one to actually behave like balenaOS. The other great thing here is that there are ARM images, so this project can eventually support both x86 and ARM.
Questions:
2. Does v3 fix the cgroup2 issue? (no cgroup v2 support)