Incus isn't correctly merging large uid/gid ranges together #55

Open
dontlaugh opened this issue Sep 7, 2024 · 2 comments

Comments

@dontlaugh
dontlaugh commented Sep 7, 2024

Full disclosure: I don't recall editing /etc/subuid or /etc/subgid for a long time. The last time I touched them on this machine was probably years ago, when I was setting up LXD.

Since then I have run the Incus migration scripts and everything has been fine. Today, containers failed to start. I rebooted my computer, and below is my entire shell session since the reboot. It shows me trying to start some containers and checking the logs, which pointed me towards /etc/subuid and /etc/subgid.

shell session debugging starting containers
coleman@augustus /home/coleman 
0 % df -h
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           3.2G  2.3M  3.2G   1% /run
/dev/nvme0n1p3  462G  283G  176G  62% /
tmpfs            16G  280K   16G   1% /dev/shm
tmpfs           5.0M   16K  5.0M   1% /run/lock
efivarfs        128K   21K  103K  17% /sys/firmware/efi/efivars
/dev/nvme0n1p1  487M  448M   39M  93% /boot/efi
/dev/nvme0n1p3  462G  283G  176G  62% /home
tmpfs           3.2G  160K  3.2G   1% /run/user/1000
tmpfs           100K     0  100K   0% /var/lib/incus/shmounts
tmpfs           100K     0  100K   0% /var/lib/incus/guestapi
/dev/nvme0n1p3  462G  283G  176G  62% /var/lib/incus/storage-pools/default
                                                                                                                                                                                                                                                               
coleman@augustus /home/coleman 
0 % i ls 
+---------+---------+------+------+-----------+-----------+
|  NAME   |  STATE  | IPV4 | IPV6 |   TYPE    | SNAPSHOTS |
+---------+---------+------+------+-----------+-----------+
| stopper | STOPPED |      |      | CONTAINER | 0         |
+---------+---------+------+------+-----------+-----------+
                                                                                                                                                                                                                                                               
coleman@augustus /home/coleman 
0 % i start stopper
Error: Failed to run: /opt/incus/bin/incusd forkstart stopper /var/lib/incus/containers /run/incus/stopper/lxc.conf: exit status 1
Try `incus info --show-log stopper` for more info
                                                                                                                                                                                                                                                               
coleman@augustus /home/coleman 
1 % incus info --show-log stopper
Name: stopper
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2024/09/07 18:35 EDT
Last Used: 2024/09/07 18:40 EDT

Log:

lxc stopper 20240907224024.186 ERROR    idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:245 - newuidmap failed to write mapping "newuidmap: write to uid_map failed: Invalid argument": newuidmap 3857 0 1000000 1000000000 0 1001000000 1000000000
lxc stopper 20240907224024.186 ERROR    start - ../src/lxc/start.c:lxc_spawn:1795 - Failed to set up id mapping.
lxc stopper 20240907224024.186 ERROR    lxccontainer - ../src/lxc/lxccontainer.c:wait_on_daemonized_start:837 - Received container state "ABORTING" instead of "RUNNING"
lxc stopper 20240907224024.187 ERROR    start - ../src/lxc/start.c:__lxc_start:2114 - Failed to spawn container "stopper"
lxc stopper 20240907224024.187 WARN     start - ../src/lxc/start.c:lxc_abort:1037 - No such process - Failed to send SIGKILL via pidfd 17 for process 3857
lxc 20240907224024.227 ERROR    af_unix - ../src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20240907224024.227 ERROR    commands - ../src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_init_pid"

                                                                                                                                                                                                                                                               
coleman@augustus /home/coleman 
0 % i launch images:ubuntu/22.04    
Launching the instance
Error: Failed instance creation: Failed to run: /opt/incus/bin/incusd forkstart able-weasel /var/lib/incus/containers /run/incus/able-weasel/lxc.conf: exit status 1
                                                                                                                                            
coleman@augustus /home/coleman 
1 % incus info --show-log able-weasel
Name: able-weasel
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2024/09/07 18:43 EDT
Last Used: 2024/09/07 18:43 EDT

Log:

lxc able-weasel 20240907224348.346 ERROR    idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:245 - newuidmap failed to write mapping "newuidmap: write to uid_map failed: Invalid argument": newuidmap 5083 0 1000000 1000000000 0 1001000000 1000000000
lxc able-weasel 20240907224348.346 ERROR    start - ../src/lxc/start.c:lxc_spawn:1795 - Failed to set up id mapping.
lxc able-weasel 20240907224348.346 ERROR    lxccontainer - ../src/lxc/lxccontainer.c:wait_on_daemonized_start:837 - Received container state "ABORTING" instead of "RUNNING"
lxc able-weasel 20240907224348.347 ERROR    start - ../src/lxc/start.c:__lxc_start:2114 - Failed to spawn container "able-weasel"
lxc able-weasel 20240907224348.347 WARN     start - ../src/lxc/start.c:lxc_abort:1037 - No such process - Failed to send SIGKILL via pidfd 17 for process 5083
lxc 20240907224348.387 ERROR    af_unix - ../src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20240907224348.387 ERROR    commands - ../src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_init_pid"

                                                                                                                                            
coleman@augustus /home/coleman 
0 % groups                                
coleman adm cdrom sudo dip plugdev lpadmin lxd sambashare incus-admin
                                                                                                                                            
coleman@augustus /home/coleman 
0 % sudo usermod -aG incus coleman
[sudo] password for coleman: 
                                                                                                                                            
coleman@augustus /home/coleman 
0 % cat /etc/subuid
coleman:100000:65536
root:1000000:1000000000
root:1001000000:1000000000
                                                                                                                                            
coleman@augustus /home/coleman 
0 % cat /etc/subgid
coleman:100000:65536
root:1000000:1000000000
root:1001000000:1000000000
                                                                                                                                            
coleman@augustus /home/coleman 
0 % sudo kak /etc/subgid 
                                                                                                                                            
coleman@augustus /home/coleman 
0 % sudo kak /etc/subuid
                                                                                                                                            
coleman@augustus /home/coleman 
0 % i launch images:ubuntu/22.04 maptest
Launching maptest
Error: Failed instance creation: Failed to run: /opt/incus/bin/incusd forkstart maptest /var/lib/incus/containers /run/incus/maptest/lxc.conf: exit status 1
                                                                                                                                            
coleman@augustus /home/coleman 
1 % incus info --show-log maptest    
Name: maptest
Status: STOPPED
Type: container
Architecture: x86_64
Created: 2024/09/07 18:53 EDT
Last Used: 2024/09/07 18:53 EDT

Log:

lxc maptest 20240907225346.676 ERROR    idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:245 - newuidmap failed to write mapping "newuidmap: uid range [0-1000000000) -> [1001000000-2001000000) not allowed": newuidmap 5832 0 1000000 1000000000 0 1001000000 1000000000
lxc maptest 20240907225346.676 ERROR    start - ../src/lxc/start.c:lxc_spawn:1795 - Failed to set up id mapping.
lxc maptest 20240907225346.676 ERROR    lxccontainer - ../src/lxc/lxccontainer.c:wait_on_daemonized_start:837 - Received container state "ABORTING" instead of "RUNNING"
lxc maptest 20240907225346.676 ERROR    start - ../src/lxc/start.c:__lxc_start:2114 - Failed to spawn container "maptest"
lxc maptest 20240907225346.676 WARN     start - ../src/lxc/start.c:lxc_abort:1037 - No such process - Failed to send SIGKILL via pidfd 17 for process 5832
lxc 20240907225346.710 ERROR    af_unix - ../src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20240907225346.710 ERROR    commands - ../src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_init_pid"

                                                                                                                                            
coleman@augustus /home/coleman 
0 % sudo systemctl restart incus
                                                                                                                                            
coleman@augustus /home/coleman 
0 % incus info --show-log maptest2
Error: Failed to fetch instance "maptest2" in project "default": Instance not found
                                                                                                                                            
coleman@augustus /home/coleman 
1 % i launch images:ubuntu/22.04 maptest2
Launching maptest2

I got containers to start by removing the second entry for root in both /etc/subuid and /etc/subgid:

% cat /etc/subuid
coleman:100000:65536
root:1000000:1000000000
root:1001000000:1000000000  # deleted this

% cat /etc/subgid
coleman:100000:65536
root:1000000:1000000000
root:1001000000:1000000000  # and deleted this, too

I was running Zabbly stable. Then, naively (instead of looking at the logs as I did here after the reboot), I upgraded to daily, which I'd been meaning to do anyway.

Is there any chance that this package manipulates /etc/subuid or /etc/subgid? The usermod --add-subuids call does look suspicious here. Is there an off-by-one error in the script?

usermod --add-subuids ${NEXT_UID}-$(($NEXT_UID+999999999)) root
usermod --add-subgids ${NEXT_GID}-$(($NEXT_GID+999999999)) root

Is this multiple-range setup even a valid config?
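
As a quick arithmetic check on the off-by-one question, here is a minimal Python sketch that parses entries in the `owner:start:count` form shown above and reports how two ranges relate. The names `SubidRange`, `parse_subid` and `relation` are made up for this illustration and are not part of Incus or shadow-utils.

```python
from typing import NamedTuple


class SubidRange(NamedTuple):
    """One line of /etc/subuid or /etc/subgid (hypothetical helper type)."""
    owner: str
    start: int
    count: int

    @property
    def end(self) -> int:
        """First ID past the range (exclusive end)."""
        return self.start + self.count


def parse_subid(text: str) -> list[SubidRange]:
    """Parse 'owner:start:count' lines, skipping blank lines."""
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        owner, start, count = line.split(":")
        ranges.append(SubidRange(owner, int(start), int(count)))
    return ranges


def relation(a: SubidRange, b: SubidRange) -> str:
    """Classify two ranges as overlapping, adjacent, or disjoint."""
    if a.start < b.end and b.start < a.end:
        return "overlapping"
    if a.end == b.start or b.end == a.start:
        return "adjacent"
    return "disjoint"


if __name__ == "__main__":
    entries = parse_subid(
        "coleman:100000:65536\n"
        "root:1000000:1000000000\n"
        "root:1001000000:1000000000\n"
    )
    root = [r for r in entries if r.owner == "root"]
    print(root[0].end, root[1].start, relation(root[0], root[1]))
```

With the values from this issue, the first root range ends (exclusively) at 1000000 + 1000000000 = 1001000000, which is exactly where the second one starts. The two entries are therefore adjacent rather than overlapping, so the `+999999999` in the script looks consistent with an inclusive first-last range of 1000000000 IDs.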

@dontlaugh
Author

Ahh, check this out. I added 1 to the start of the second range for root, and incus faithfully started a container.

coleman@augustus /home/coleman 
0 % cat /etc/subgid
coleman:100000:65536
root:1000000:1000000000
root:1001000001:1000000000

coleman@augustus /home/coleman 
0 % cat /etc/subuid
coleman:100000:65536
root:1000000:1000000000
root:1001000001:1000000000

@stgraber
Member

root:1000000:1000000000
root:1001000000:1000000000
lxc able-weasel 20240907224348.346 ERROR    idmap_utils - ../src/lxc/idmap_utils.c:lxc_map_ids:245 - newuidmap failed to write mapping "newuidmap: write to uid_map failed: Invalid argument": newuidmap 5083 0 1000000 1000000000 0 1001000000 1000000000

That's interesting, so here we can see Incus / LXC attempting to set up two maps starting at uid 0.
That obviously isn't going to work.
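
To make the clash concrete, here is a small Python sketch that splits the `newuidmap` arguments from the log into (inside start, host start, count) triples and looks for entries whose inside-the-namespace ranges overlap. As far as I know the kernel refuses a uid_map whose ranges overlap, which would match the "Invalid argument" error. The function names are hypothetical and only for illustration.

```python
from itertools import combinations

# (start inside the namespace, start on the host, number of IDs)
MapEntry = tuple[int, int, int]


def parse_map_args(args: str) -> list[MapEntry]:
    """Split the numbers that follow the PID in a newuidmap call into triples."""
    nums = [int(x) for x in args.split()]
    if len(nums) % 3 != 0:
        raise ValueError("expected groups of three integers")
    return [(nums[i], nums[i + 1], nums[i + 2]) for i in range(0, len(nums), 3)]


def inside_overlaps(entries: list[MapEntry]) -> list[tuple[MapEntry, MapEntry]]:
    """Return every pair of entries whose namespace-side ranges intersect."""
    clashes = []
    for a, b in combinations(entries, 2):
        if a[0] < b[0] + b[2] and b[0] < a[0] + a[2]:
            clashes.append((a, b))
    return clashes


if __name__ == "__main__":
    # Arguments taken from the log line above, without the PID.
    entries = parse_map_args("0 1000000 1000000000 0 1001000000 1000000000")
    for a, b in inside_overlaps(entries):
        print("both entries map inside IDs starting at", a[0], "and", b[0])
```

Both triples from the log start at inside ID 0 and span 1000000000 IDs, so the check reports one clashing pair.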

Instead I'd have expected to either have just the first map be picked (old LXD behavior) or the two maps be merged together, leading to 2000000000 uid/gid for that instance.
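
The merged outcome described above can be sketched roughly as follows. This is an illustrative Python sketch only, not Incus's actual code: it joins host ranges that are exactly contiguous and then hands out inside IDs sequentially so that no two map entries start at the same inside ID. It deliberately does not handle overlapping host ranges, and `merge_host_ranges` and `build_map` are hypothetical names.

```python
# (start on the host, number of IDs)
HostRange = tuple[int, int]
# (start inside the namespace, start on the host, number of IDs)
MapEntry = tuple[int, int, int]


def merge_host_ranges(ranges: list[HostRange]) -> list[HostRange]:
    """Join ranges where one ends exactly where the next begins."""
    merged: list[HostRange] = []
    for start, count in sorted(ranges):
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged


def build_map(ranges: list[HostRange]) -> list[MapEntry]:
    """Assign inside IDs one block after another, starting at 0."""
    entries: list[MapEntry] = []
    next_inside = 0
    for host_start, count in merge_host_ranges(ranges):
        entries.append((next_inside, host_start, count))
        next_inside += count
    return entries


if __name__ == "__main__":
    # The two root ranges from /etc/subuid in this issue.
    print(build_map([(1000000, 1000000000), (1001000000, 1000000000)]))
```

For the two root ranges in this issue the sketch yields a single entry, inside ID 0 mapped to host ID 1000000 with a count of 2000000000, which is the merged result expected above.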

@stgraber changed the title from "Weird edgecase (?) with /etc/subuid and /etc/subgid" to "Incus isn't correctly merging large uid/gid ranges together" on Sep 11, 2024