Scaling up cluster can destroy /var/lib/docker mount on existing workers #18

Open · bodom0015 opened this issue Oct 29, 2020 · 0 comments

This one is sort of a rabbit hole, but I think I've figured out what is causing it.

Scaling up the CHEESEhub production cluster from 1 worker to 2 seemed to go smoothly. During testing, however, I noticed that some pods were failing to start and remained stuck in ContainerCreating. Describing one of these pods yielded an unfamiliar error message, repeated many times:

Events:
  Type     Reason                  Age               From               Message
  ----     ------                  ----              ----               -------
  Normal   Scheduled               2m46s             default-scheduler  Successfully assigned lambert8/szhq4c-sqlinjection-ntprk to cheesehub-worker0
  Warning  FailedCreatePodSandBox  2m46s             kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "szhq4c-sqlinjection-ntprk": Error response from daemon: mkdir /var/lib/docker/overlay2/00bbc8fca42634c3bab11d005cb25e6d76a27a57a3cc59d780ea068867a0ce93-init: input/output error
  Warning  FailedCreatePodSandBox  2m33s             kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "szhq4c-sqlinjection-ntprk": Error response from daemon: mkdir /var/lib/docker/overlay2/f6af26efbf925d8c41c82e816823ebc36310ec62ae343ae201faaea11937d9e2-init: input/output error
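
For reference, something like the following should be enough to confirm whether the failures are confined to a single worker (the pod and namespace names are taken from the event above):

# List all pods scheduled on the suspect worker
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=cheesehub-worker0

# Show the recent events for one of the stuck pods
kubectl describe pod szhq4c-sqlinjection-ntprk -n lambert8 | tail -n 20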

After much trial and error, I noticed that only pods scheduled on worker0 showed this error behavior. Digging into the specifics of that node, I looked at dmesg -T and found odd EXT4 filesystem errors being hit by the Docker daemon:

[Thu Oct 29 18:44:57 2020] EXT4-fs warning (device vdb): dx_probe:740: inode #2097153: lblock 0: comm dockerd: error -5 reading directory block
[Thu Oct 29 18:44:57 2020] EXT4-fs warning (device vdb): dx_probe:740: inode #2097153: lblock 0: comm dockerd: error -5 reading directory block
[Thu Oct 29 18:44:57 2020] EXT4-fs warning (device vdb): dx_probe:740: inode #2097153: lblock 0: comm dockerd: error -5 reading directory block
[Thu Oct 29 18:44:57 2020] EXT4-fs warning (device vdb): dx_probe:740: inode #2097153: lblock 0: comm dockerd: error -5 reading directory block
[Thu Oct 29 18:44:58 2020] EXT4-fs error (device vdb): __ext4_get_inode_loc:4115: inode #2889247: block 11534721: comm dockerd: unable to read itable block
[Thu Oct 29 18:44:58 2020] EXT4-fs warning (device vdb): dx_probe:740: inode #2097153: lblock 0: comm dockerd: error -5 reading directory block
[Thu Oct 29 18:44:58 2020] EXT4-fs warning (device vdb): dx_probe:740: inode #2097153: lblock 0: comm dockerd: error -5 reading directory block
[Thu Oct 29 18:45:00 2020] EXT4-fs error (device vdb): __ext4_get_inode_loc:4115: inode #2889247: block 11534721: comm dockerd: unable to read itable block
[Thu Oct 29 18:45:00 2020] EXT4-fs error (device vdb): __ext4_get_inode_loc:4115: inode #2889247: block 11534721: comm dockerd: unable to read itable block
[Thu Oct 29 18:45:00 2020] EXT4-fs error (device vdb): __ext4_get_inode_loc:4115: inode #2889345: block 11534728: comm dockerd: unable to read itable block
[Thu Oct 29 18:45:00 2020] EXT4-fs error (device vdb): __ext4_get_inode_loc:4115: inode #2889230: block 11534720: comm dockerd: unable to read itable block
[Thu Oct 29 18:45:00 2020] EXT4-fs error (device vdb): __ext4_get_inode_loc:4115: inode #2889211: block 11534719: comm dockerd: unable to read itable block
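
A couple of quick read-only checks should confirm that the filesystem itself is unhealthy (device names are the ones on this node and may differ):

# Filter kernel messages for filesystem / I/O errors
dmesg -T | grep -iE 'ext4|i/o error'

# Inspect the superblock state on the currently-attached device
# (which on this node had become vdc, see below)
sudo tune2fs -l /dev/vdc | grep -i state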

Attempting to restart Docker failed:

ubuntu@cheesehub-worker0:~$ sudo systemctl restart docker
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
ubuntu@cheesehub-worker0:~$ systemctl status docker.service
● docker.service - Docker Application Container Engine
   Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
   Active: failed (Result: start-limit-hit) since Thu 2020-10-29 18:53:42 UTC; 257ms ago
     Docs: https://docs.docker.com
  Process: 1709 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=1/FAILURE)
 Main PID: 1709 (code=exited, status=1/FAILURE)

Oct 29 18:53:40 cheesehub-worker0 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Oct 29 18:53:40 cheesehub-worker0 systemd[1]: Failed to start Docker Application Container Engine.
Oct 29 18:53:40 cheesehub-worker0 systemd[1]: docker.service: Unit entered failed state.
Oct 29 18:53:40 cheesehub-worker0 systemd[1]: docker.service: Failed with result 'exit-code'.
Oct 29 18:53:42 cheesehub-worker0 systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Oct 29 18:53:42 cheesehub-worker0 systemd[1]: Stopped Docker Application Container Engine.
Oct 29 18:53:42 cheesehub-worker0 systemd[1]: docker.service: Start request repeated too quickly.
Oct 29 18:53:42 cheesehub-worker0 systemd[1]: Failed to start Docker Application Container Engine.
Oct 29 18:53:42 cheesehub-worker0 systemd[1]: docker.service: Unit entered failed state.
Oct 29 18:53:42 cheesehub-worker0 systemd[1]: docker.service: Failed with result 'start-limit-hit'.
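
The reason dockerd keeps exiting should be visible in its journal, e.g.:

# Last 50 Docker daemon log lines; in this case they would presumably show the
# same I/O errors under /var/lib/docker
sudo journalctl -u docker.service --no-pager -n 50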

I then tried to look into /var/lib/docker, only to find that it could no longer even be read:

ubuntu@cheesehub-worker0:~$ sudo su
root@cheesehub-worker0:/home/ubuntu# cd /var/lib/docker
root@cheesehub-worker0:/var/lib/docker# ls -al
ls: reading directory '.': Input/output error
total 0

I checked mount to see what should have been mounted there:

root@cheesehub-worker0:/var/lib/docker# mount | grep /var/lib/docker
/dev/vdb on /var/lib/docker type ext4 (ro,relatime,data=ordered)

lsblk showed no device named vdb; the device had somehow been renamed from vdb to vdc:

root@cheesehub-worker0:/var/lib/docker# lsblk
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda    253:0    0  20G  0 disk 
└─vda1 253:1    0  20G  0 part /
vdc    253:32   0  75G  0 disk 
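
Since virtio device names like vdb/vdc are not stable across detach/attach, the by-id symlinks and filesystem UUID are a more reliable way to confirm this is still the same disk (a sketch; exact paths depend on the node):

# by-id symlinks follow the volume even when it is renamed from vdb to vdc
ls -l /dev/disk/by-id/ | grep vd

# Filesystem UUID of the reattached device
sudo blkid /dev/vdc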

I should also note that during this time the OpenStack UI still reported the stale device name:

Name Description Size Status Group Type Attached To Availability Zone Bootable Encrypted
cheesehub-worker0-docker - 75GiB In-use - default /dev/vdb on cheesehub-worker0 nova No No

I wondered how this could possibly have happened, and the only logical explanation is that Terraform somehow clobbered this mount while scaling up the cluster.

Checking the Action Log in OpenStack, it looks like the docker volume was detached and reattached during the scale-up. Doing that to a live node without restarting it is obviously very bad for the cluster resources running there.

Request ID Action Start Time User ID Message
req-c6561e00-0a08-4b95-84d1-eb56b083cf55 attach_volume Oct. 23, 2020, 9:04 p.m. 8ba875a177fdd5eca3646fd1acfafcb50e07787cd364a758ac878b69e601c6c1 -
req-641b194c-0c81-49ec-aff5-ff503ea9330f detach_volume Oct. 23, 2020, 9:01 p.m. 8ba875a177fdd5eca3646fd1acfafcb50e07787cd364a758ac878b69e601c6c1 -
req-ebbcb046-2dc0-432b-8f6f-e321333e988d attach_volume Sept. 26, 2020, 12:52 a.m. 8ba875a177fdd5eca3646fd1acfafcb50e07787cd364a758ac878b69e601c6c1 -
req-05e08230-4946-4ceb-8e25-6a2672992f2e Create Sept. 26, 2020, 12:51 a.m. 8ba875a177fdd5eca3646fd1acfafcb50e07787cd364a758ac878b69e601c6c1 -
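
The same history can be pulled with the OpenStack CLI, which is handy for checking the other workers as well (names are the ones used in this cluster):

# Instance actions for the worker; should show the detach/attach pair
openstack server event list cheesehub-worker0

# Current status and attachment info for the docker volume
openstack volume show cheesehub-worker0-docker -c status -c attachments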

I'm fairly certain this has to do with the imperative way we've written the provisioning/bootstrapping side of the Terraform deployment.

In the short term, rebooting the node resolved the mount issue.
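
For anyone else hitting this, the safer version of that reboot is roughly the following sketch (not necessarily what was run here):

# Drain the affected worker, reboot it, then let it take pods again
kubectl drain cheesehub-worker0 --ignore-daemonsets --delete-local-data
ssh ubuntu@cheesehub-worker0 'sudo reboot'
kubectl uncordon cheesehub-worker0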

In the long term, the deployment needs to account for existing nodes when scaling up. These mounts do not need to be touched, so we need to figure out why Terraform thinks the attachments have to change and prevent it from recreating them.
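
Until that root cause is fixed, reviewing the plan before any scale-up should at least make the problem visible before it happens (the grep pattern assumes an OpenStack volume-attachment resource; the actual resource names aren't shown here):

# Render the scale-up plan and look for attachments on existing workers that
# Terraform wants to destroy and recreate
terraform plan -out=scale-up.tfplan
terraform show -no-color scale-up.tfplan | grep -E -B2 -A5 'volume_attach|must be replaced'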
