[GH-ISSUE #2985] Problems with chroot and user namespaces #1869

Open
opened 2026-05-05 08:32:08 -06:00 by gitea-mirror · 1 comment
Originally created by @zb3 on GitHub (Oct 2, 2019).
Original GitHub issue: https://github.com/netblue30/firejail/issues/2985

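Firejail calls chroot() when it mounts an overlayfs and when the --chroot option is used. However, a process whose root directory differs from the root of its mount namespace cannot create user namespaces (the kernel rejects CLONE_NEWUSER with EPERM), so chroot() takes that ability away from the sandboxed process. So while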

firejail --noprofile unshare -U id

works,

firejail --noprofile --overlay-tmpfs unshare -U id

doesn't, because firejail uses chroot, and the sandboxed process can't create user namespaces anymore.
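
To see the restriction in isolation, here is a minimal sketch of a probe. The function name `try_create_userns` is made up for illustration and is not part of firejail. It forks a child that requests a new user namespace and reports the resulting errno. According to user_namespaces(7), a caller whose root directory does not match the root of its mount namespace gets EPERM; other errors are possible too, for instance when unprivileged user namespaces are disabled on the system.

```c
#include <errno.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallback in case <sched.h> did not expose the flag (it needs _GNU_SOURCE). */
#ifndef CLONE_NEWUSER
#define CLONE_NEWUSER 0x10000000
#endif

/*
 * Hypothetical helper, not firejail code.
 * Returns 0 if a child process could create a user namespace,
 * the child's errno value (e.g. EPERM) if the request failed,
 * or -1 if the probe itself could not be run.
 */
int try_create_userns(void) {
	pid_t pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: ask for a new user namespace via the raw syscall. */
		if (syscall(SYS_unshare, CLONE_NEWUSER) == 0)
			_exit(0);
		/* Pass errno back through the 8-bit exit status. */
		_exit(errno & 0xff);
	}

	int status;
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Run from inside `firejail --noprofile --overlay-tmpfs`, a program calling this helper would be expected to get EPERM back, and 0 without the overlay, assuming unprivileged user namespaces are otherwise permitted on the machine.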

Similarly, Chromium without the SUID sandbox (which will be removed in the future) doesn't work:

firejail --noprofile --overlay-tmpfs chromium --disable-setuid-sandbox

Could pivot_root() be used instead? Other sandboxing programs use it in place of chroot(), and since it changes the root of the current mount namespace itself, sandboxed programs can still create their own user namespaces.

pivot_root requires at least that:

  • a new mount namespace is used (it seems that firejail already does this)
  • the target root directory is not on the same filesystem as the current root (for overlay it obviously isn't, while for --chroot the target seems to be bind-mounted anyway)
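
A related precondition from pivot_root(2) is that the new root must itself be a mount point, which is why a bind mount is enough for the --chroot case. The following is a rough sketch of how that could be checked before calling pivot_root. The helper name `is_mount_point` is my own and not something firejail provides, and the parser is deliberately simplistic: it compares the raw fifth field of /proc/self/mountinfo and does not decode octal escapes such as `\040`.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not firejail code.
 * Returns 1 if path appears as a mount point in /proc/self/mountinfo,
 * 0 if it does not, and -1 if the file cannot be opened.
 */
int is_mount_point(const char *path) {
	FILE *fp = fopen("/proc/self/mountinfo", "r");
	if (!fp)
		return -1;

	char line[4096];
	int found = 0;
	while (fgets(line, sizeof(line), fp)) {
		char mnt[4096];
		/* Fields: mount ID, parent ID, major:minor, root, mount point, ... */
		if (sscanf(line, "%*d %*d %*s %*s %4095s", mnt) == 1 &&
		    strcmp(mnt, path) == 0) {
			found = 1;
			break;
		}
	}

	fclose(fp);
	return found;
}
```

The path has to be given exactly as the kernel lists it (absolute, no trailing slash), so a real implementation would probably canonicalize it first.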

Here's a PoC patch I've made to check whether pivot_root can work here. While I got unshare -U and chromium --disable-setuid-sandbox to work with --overlay-tmpfs, I have no idea what other side effects this introduces (and whether firejail even works beyond these simple use cases), so I'll just drop this here:

diff --git a/src/firejail/firejail.h b/src/firejail/firejail.h
index a6377261..482ffdd5 100644
--- a/src/firejail/firejail.h
+++ b/src/firejail/firejail.h
@@ -490,6 +490,7 @@ int safe_fd(const char *path, int flags);
 int invalid_sandbox(const pid_t pid);
 int has_handler(pid_t pid, int signal);
 void enter_network_namespace(pid_t pid);
+int pivot_root(const char *new_root, const char *put_old);
 
 // Get info regarding the last kernel mount operation from /proc/self/mountinfo
 // The return value points to a static area, and will be overwritten by subsequent calls.
diff --git a/src/firejail/fs.c b/src/firejail/fs.c
index ce2ca5e2..725243ca 100644
--- a/src/firejail/fs.c
+++ b/src/firejail/fs.c
@@ -852,6 +852,35 @@ char *fs_check_overlay_dir(const char *subdirname, int allow_reuse) {
 // # exit
 // # umount /root/overlay/root
 
+// new_root must not be on the same filesystem as the current root
+void change_root_in_ns(const char *new_root) {
+	int oldroot = open("/", O_DIRECTORY | O_RDONLY);
+
+	if (oldroot < 0)
+		errExit("open");
+
+	if (chdir(new_root) < 0)
+		errExit("chdir");
+
+	if (pivot_root(".", ".") < 0)
+		errExit("pivot_root");
+
+	// must unmount the old root, which is now mounted on top of the new one
+	if (fchdir(oldroot) < 0)
+		errExit("fchdir");
+
+	close(oldroot);
+
+	// the old root must not be shared
+	if (mount("", ".", "", MS_SLAVE | MS_REC, NULL) < 0)
+		errExit("mount");
+
+	if (umount2(".", MNT_DETACH) < 0)
+		errExit("umount2");
+
+	if (chdir("/") == -1)
+		errExit("chdir");
+}
 
 // to do: fix the code below; also, it might work without /dev, but consider keeping /dev/shm; add locking mechanism for overlay-clean
 #include <sys/utsname.h>
@@ -1089,8 +1118,7 @@ void fs_overlayfs(void) {
 #ifdef HAVE_GCOV
 	__gcov_flush();
 #endif
-	if (chroot(oroot) == -1)
-		errExit("chroot");
+	change_root_in_ns(oroot);
 
 	// update /var directory in order to support multiple sandboxes running on the same root directory
 //	if (!arg_private_dev)
@@ -1344,8 +1372,7 @@ void fs_chroot(const char *rootdir) {
 		errExit("mkdir");
 	if (mount(rootdir, oroot, NULL, MS_BIND|MS_REC, NULL) < 0)
 		errExit("mounting rootdir oroot");
-	if (chroot(oroot) < 0)
-		errExit("chroot");
+	change_root_in_ns(oroot);
 
 	// create all other /run/firejail files and directories
 	preproc_build_firejail_dir();
diff --git a/src/firejail/util.c b/src/firejail/util.c
index 4634993d..cfc4db3e 100644
--- a/src/firejail/util.c
+++ b/src/firejail/util.c
@@ -35,6 +35,8 @@
 # define O_PATH 010000000
 #endif
 
+#include <syscall.h>
+
 #define MAX_GROUPS 1024
 #define MAXBUF 4098
 #define EMPTY_STRING ("")
@@ -1330,3 +1332,12 @@ void enter_network_namespace(pid_t pid) {
 		exit(1);
 	}
 }
+
+int pivot_root(const char *new_root, const char *put_old) {
+#ifdef __NR_pivot_root
+	return syscall (__NR_pivot_root, new_root, put_old);
+#else
+	errno = ENOSYS;
+	return -1;
+#endif
+}
gitea-mirror added the enhancement label 2026-05-05 08:32:08 -06:00

@netblue30 commented on GitHub (Nov 5, 2019):

Thanks for the patch. I'll grab it after we release the current version. There will be some more work there.
