[GH-ISSUE #4734] Error: cannot join namespace user #2767

Closed
opened 2026-05-05 09:25:57 -06:00 by gitea-mirror · 8 comments

Originally created by @minus7 on GitHub (Dec 2, 2021).
Original GitHub issue: https://github.com/netblue30/firejail/issues/4734

Description

Sending a link to open to Firefox running in firejail via firejail --join=ff2 firefox http://localhost sometimes starts failing permanently for that target firejail with the error message Error: cannot join namespace user. I usually observe that when I open multiple links in quick succession.

I have seen this happen for a long time, but very rarely (less than once a month). Looking at the call site (https://github.com/netblue30/firejail/blob/0.9.66/src/firejail/join.c#L496), this seems like it's not a bug in firejail but rather one in the kernel. Wondering if anyone else has come across that and, like me, just hasn't posted this yet.

Steps to Reproduce

I haven't tried reproducing this with a minimal test case yet, so I'll leave my setup as reference:

Firejail is started with the following command, which is also used to open links (the shorter --join command mentioned above fails as well):

firejail \
	--ignore=seccomp '--seccomp=!kcmp,!chroot' \
	--profile=$HOME/.dotfiles/firejail/firefox.profile \
	--net=br-vpn \
	--dns=8.8.8.8 \
	--ip=10.14.38.130 --netmask=255.255.255.128 \
	--defaultgw=10.14.38.129 \
	--mtu=1420 \
	--hosts-file=/home/minus/tmp/ff2.hosts \
	--join-or-start=ff2 \
	firefox "$@"

br-vpn is a bridge with a WireGuard interface attached to it. I assume most of it is irrelevant though.

Environment

  • Arch Linux with kernel 5.15.5-arch1-1 but has also happened on previous versions
  • Firejail package version 0.9.66-3

Checklist

  • [ ] The issue is caused by firejail (i.e. running the program by path (e.g. /usr/bin/vlc) "fixes" it).
  • [ ] I can reproduce the issue without custom modifications (e.g. globals.local).
  • [x] I have performed a short search for similar issues (to avoid opening a duplicate).
gitea-mirror closed this issue and added the stale label (2026-05-05 09:25:57 -06:00)

@rusty-snake commented on GitHub (Dec 2, 2021):

Maybe related: #4543


@minus7 commented on GitHub (Dec 2, 2021):

Unlikely, since Firefox continues working just fine, and invocations don't even get to running firefox. On the other hand, if running firejail simultaneously can somehow lead to breaking the namespace setup, that could be a possible explanation.

Joining the user namespace also fails with nsenter, with a different message (presumably from strerror(3)), but unfortunately I don't have that at hand now.


@minus7 commented on GitHub (Dec 2, 2021):

Inspired by https://github.com/netblue30/firejail/discussions/4538#discussioncomment-1314540, I tried reproducing it using:

firejail --name=test --net=br-vpn --dns=8.8.8.8 --ip=10.14.38.131 --netmask=255.255.255.128 --defaultgw=10.14.38.129 --mtu=1420 --hosts-file=/home/minus/tmp/ff2.hosts --ignore=seccomp '--seccomp=!kcmp,!chroot' --profile=$HOME/.dotfiles/firejail/firefox.profile bash

and

while true; do firejail --join=test bash -c 'exit' & firejail --join=test bash -c 'exit' & firejail --join=test bash -c 'exit' & firejail --join=test bash -c 'exit' & firejail --join=test bash -c 'exit' & firejail --join=test bash -c 'exit' & firejail --join=test bash -c 'exit' ; done

but without success
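
For longer unattended runs, the same idea can be written as a bounded loop that counts failed joins instead of spinning forever. This is only a sketch: stress_join is a hypothetical helper name (not part of firejail), and it treats any non-zero exit status as a failure, which is an assumption about how the join error surfaces.

```shell
# stress_join ROUNDS JOBS CMD...: hypothetical helper, not part of firejail.
# Starts JOBS copies of CMD in parallel, ROUNDS times, and counts how many
# of them exit with a non-zero status.
stress_join() {
    rounds=$1
    jobs=$2
    shift 2
    failures=0
    r=0
    while [ "$r" -lt "$rounds" ]; do
        pids=""
        j=0
        while [ "$j" -lt "$jobs" ]; do
            # Run the command in the background and remember its PID.
            "$@" >/dev/null 2>&1 &
            pids="$pids $!"
            j=$((j + 1))
        done
        # Collect the exit status of every job from this round.
        for pid in $pids; do
            wait "$pid" || failures=$((failures + 1))
        done
        r=$((r + 1))
    done
    echo "failures=$failures"
}

# Example, using the join command from the loop above:
# stress_join 100 7 firejail --join=test bash -c 'exit'
```

Taking the command as arguments keeps the loop independent of the sandbox name, so the same helper can be pointed at a different target.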


@ghost commented on GitHub (Dec 2, 2021):

Would you mind posting your $HOME/.dotfiles/firejail/firefox.profile here please? If you happen to have a globals.local and a non-default firejail.config it would help to see those too, so we can try to follow/reproduce what exactly happens on your side. There are other ways to open links in a firejailed firefox, but let's try to get a clear picture of this specific issue before going there.


@minus7 commented on GitHub (Dec 3, 2021):

The firefox.profile only has mkdir/whitelist to add some dirs and otherwise only includes /etc/firejail/firefox.profile. I have not made any other modifications to the config (no globals.local and firejail.config is all commented out)


@minus7 commented on GitHub (Dec 5, 2021):

Okay, it happened again and this time I compared a strace to a working instance. Relevant parts here: broken (https://github.com/netblue30/firejail/files/7655914/bork.txt), working (https://github.com/netblue30/firejail/files/7655915/ok.txt). The joining firejail's PIDs are changed to 13337/13338 for better diffability; the primary child processes' PIDs are unchanged: 2715 in the broken one, 21888 in the working one.

The weird difference here is the chroot. It fails on the working instance: Firejail tries to chroot to the root of process 21888, which is the bash child process started with firejail --name=test […] bash. This obviously fails, since this chroot is executed after joining the PID namespace, so PID 21888 doesn't exist (or rather, it's PID 1). I'm not sure if Firejail does that on purpose or if that's a bug, but it is why strange things happen on the broken instance:
Here the chroot succeeds although it shouldn't. Entering the broken firejail works with nsenter if you don't try to join the user namespace: sudo nsenter -i/proc/2715/ns/ipc -n/proc/2715/ns/net -p/proc/2715/ns/pid -u/proc/2715/ns/uts -m/proc/2715/ns/mnt bash. Looking around there for a bit we can see the following:

$ sudo nsenter -i/proc/2715/ns/ipc -n/proc/2715/ns/net -p/proc/2715/ns/pid -u/proc/2715/ns/uts -m/proc/2715/ns/mnt bash 
# ls /proc | grep 2715
# ls /proc/2715
arch_status  cmdline		 environ  latency    mountinfo	 oom_adj	root	   smaps_rollup  task
attr	     comm		 exe	  limits     mounts	 oom_score	sched	   stack	 timens_offsets
autogroup    coredump_filter	 fd	  loginuid   mountstats  oom_score_adj	schedstat  stat		 timers
auxv	     cpu_resctrl_groups  fdinfo   map_files  net	 pagemap	sessionid  statm	 timerslack_ns
cgroup	     cpuset		 gid_map  maps	     ns		 personality	setgroups  status	 uid_map
clear_refs   cwd		 io	  mem	     numa_maps	 projid_map	smaps	   syscall	 wchan
# ls -l /proc/2715/root
lrwxrwxrwx 1 minus minus 0 Dec  5 13:46 /proc/2715/root -> '/proc/2712/fdinfo (deleted)'
# chroot /proc/2715/root /bin/sh
chroot: failed to run command ‘/bin/sh’: No such file or directory

The outside PID, while not listed in the directory listing of /proc, is accessible (in the working instance it gives the expected 'No such file or directory'). Moreover, its root seems to be broken or inaccessible, so there's obviously no /proc/1/ns/user in there to join, which explains the ENOENT in the strace.
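
The "accessible but not listed" check done by hand above can be sketched as a small function. proc_visibility is a hypothetical helper name; the optional second argument (a directory to use instead of /proc) is only there so the function can be tried against a test directory.

```shell
# proc_visibility PID [PROCROOT]: hypothetical helper.
# Prints "listed" if PID appears in a directory listing of PROCROOT,
# "hidden" if PROCROOT/PID can be opened but is not listed,
# and "absent" if it does not exist at all.
proc_visibility() {
    pid=$1
    root=${2:-/proc}
    if ls "$root" | grep -qxF -- "$pid"; then
        echo listed
    elif [ -d "$root/$pid" ]; then
        echo hidden
    else
        echo absent
    fi
}

# Example, run inside the namespaces entered with nsenter:
# proc_visibility 2715
```

In the broken instance described here, this would be expected to report "hidden" for 2715, and "absent" in the working one.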

Looking in the broken /proc further, there's a PID 2710, which oddly seems to have the same cmdline as the 2715 ghost process (which is very odd unless firefox starts two processes with the same childID argument), but stat says it's something else:

# ls -l /proc/2710/root
lrwxrwxrwx 1 minus minus 0 Dec  5 14:46 /proc/2710/root -> '/proc/2712/fdinfo (deleted)'
# ls -l /proc/2715/root
lrwxrwxrwx 1 minus minus 0 Dec  5 13:46 /proc/2715/root -> '/proc/2712/fdinfo (deleted)'
# cat /proc/2710/cmdline | tr '\0' ' ' ; echo
/usr/lib/firefox/firefox -contentproc -childID 29 -isForBrowser -prefsLen 8807 -prefMapSize 446731 -jsInit 278680 -parentBuildID 20211121002925 -appdir /usr/lib/firefox/browser 18 true tab 
# cat /proc/2715/cmdline | tr '\0' ' ' ; echo
/usr/lib/firefox/firefox -contentproc -childID 29 -isForBrowser -prefsLen 8807 -prefMapSize 446731 -jsInit 278680 -parentBuildID 20211121002925 -appdir /usr/lib/firefox/browser 18 true tab 
# cat /proc/2710/stat
2710 (Isolated Web Co) S 18 0 0 0 -1 4194560 25103 0 9275 0 107 49 0 0 25 5 27 0 606643 2549542912 32593 18446744073709551615 94792987556192 94792988068928 140732538040224 0 0 0 0 69634 1082133752 0 0 0 17 23 0 0 0 0 0 94792988080816 94792988080912 94793002209280 140732538043776 140732538043965 140732538043965 140732538048479 0
[root@phoenix /]# cat /proc/2715/stat
2715 (Socket Thread) S 18 0 0 0 -1 4194368 25103 0 9275 0 107 49 0 0 25 5 27 0 606648 2549542912 32593 18446744073709551615 94792987556192 94792988068928 140732538040224 0 0 0 0 69634 1082133752 0 0 0 -1 12 0 0 0 0 0 94792988080816 94792988080912 94793002209280 140732538043776 140732538043965 140732538043965 140732538048479 0

Looks like that 2715 process just points at some garbage data, and the fact that it has the same PID as the first firefox process on the host surely isn't a coincidence.
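
One possible reading of the output above, not confirmed anywhere in this thread: on Linux, /proc/<tid> directories of non-leader threads can be opened by path but do not appear in a listing of /proc (see proc(5)), and the comm of 2715 is "Socket Thread". So 2715 inside the PID namespace may simply be a thread ID of a Firefox process rather than garbage. A sketch for checking this by comparing the Tgid and Pid fields of the status file follows; thread_or_process is a hypothetical helper name, and the optional second argument replaces /proc for testing.

```shell
# thread_or_process PID [PROCROOT]: hypothetical helper.
# Compares the Tgid and Pid fields of PROCROOT/PID/status. If they differ,
# the ID belongs to a thread of the process whose PID is Tgid.
thread_or_process() {
    status=${2:-/proc}/$1/status
    [ -r "$status" ] || { echo "unknown"; return 1; }
    tgid=$(awk '$1 == "Tgid:" { print $2 }' "$status")
    pid=$(awk '$1 == "Pid:" { print $2 }' "$status")
    if [ "$tgid" = "$pid" ]; then
        echo "process (thread-group leader)"
    else
        echo "thread of $tgid"
    fi
}

# Example, run inside the namespaces entered with nsenter:
# thread_or_process 2715
```

If this reports a thread, the successful chroot to /proc/2715/root would be resolving the root of an unrelated sandboxed process, which fits the failure seen afterwards.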

Edit: after patching out the chroot call in firejail, it works


@rusty-snake commented on GitHub (Jun 8, 2022):

firejail 0.9.70 has improved join code. Can you test whether this still happens with it once it is released?


@rusty-snake commented on GitHub (Oct 30, 2022):

I'm closing this due to inactivity. Please feel free to request a reopen if you still have this issue.

Reference: github-starred/firejail#2767