[GH-ISSUE #86] Can't run 32 bit executable on a 64 bit kernel if seccomp filter is enabled #50

Closed
opened 2026-05-05 04:53:54 -06:00 by gitea-mirror · 6 comments
Owner

Originally created by @dzamlo on GitHub (Oct 20, 2015).
Original GitHub issue: https://github.com/netblue30/firejail/issues/86

If you try to run a 32 bit executable on a 64 bit kernel with seccomp filter enabled you get a "Bad system call" message.

Author
Owner

@netblue30 commented on GitHub (Oct 20, 2015):

Syscall numbers don't match between 32-bit and 64-bit architectures. For example, syscall 311 is a harmless sys_set_robust_list on 32-bit and a troublesome process_vm_writev on 64-bit. The kernel seccomp module will shut down the process. There is no way to fix this in user space; a fix in the kernel would be necessary. Affected programs: Wine, Steam.
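
To make the number clash concrete, the following C sketch hardcodes what syscall number 311 means in the i386 and x86_64 syscall tables. The helper names and the `ARCH_*` macros are made up for this illustration (the values mirror `AUDIT_ARCH_I386` and `AUDIT_ARCH_X86_64` from `<linux/audit.h>`); none of this is Firejail code.

```c
#include <string.h>

/* Values of AUDIT_ARCH_I386 and AUDIT_ARCH_X86_64 from <linux/audit.h>,
 * repeated here so the sketch builds without kernel headers. */
#define ARCH_I386   0x40000003u
#define ARCH_X86_64 0xC000003Eu

/* Hypothetical helper: what does syscall number 311 mean on this arch? */
static const char *name_of_syscall_311(unsigned int arch)
{
    switch (arch) {
    case ARCH_I386:
        return "set_robust_list";    /* harmless */
    case ARCH_X86_64:
        return "process_vm_writev";  /* writes into another process */
    default:
        return "unknown";
    }
}

/* Returns 1 when the same number names two different calls, which is
 * why a filter that compares only the number cannot tell them apart. */
static int same_number_different_call(void)
{
    return strcmp(name_of_syscall_311(ARCH_I386),
                  name_of_syscall_311(ARCH_X86_64)) != 0;
}
```

A blacklist written against the x86_64 table therefore needs to know which architecture issued the call before it can interpret the number.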

Author
Owner

@dzamlo commented on GitHub (Oct 21, 2015):

I'm not really familiar with bpf/seccomp/syscalls, so maybe this is wrong,
but in the seccomp-bpf filter you can check the architecture and filter different syscall numbers depending on the architecture.

The difficulty would be getting all the syscall numbers for both architectures, but this has already been done in the libseccomp project (https://github.com/seccomp/libseccomp) (see their scmp_sys_resolver tool, for example).

Here is a quick and dirty example that filters the nanosleep call on both 32 and 64 bits without filtering other syscalls:
https://gist.github.com/dzamlo/1ca206e4664a2a845886
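
For readers who cannot open the gist, here is a minimal sketch of the same idea. It is not a copy of the linked code. It assumes Linux kernel headers are available, uses the nanosleep numbers from the kernel syscall tables (35 on x86_64, 162 on i386), and the names `nanosleep_filter` and `install_nanosleep_filter` are invented for this example. It does not handle x32.

```c
#include <stddef.h>
#include <sys/prctl.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

#define NR_NANOSLEEP_X86_64 35
#define NR_NANOSLEEP_I386   162

/* Jump offsets are relative to the following instruction. */
static struct sock_filter nanosleep_filter[] = {
    /* [0] load the architecture of the calling code */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
    /* [1] x86_64: go to [4] */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 2, 0),
    /* [2] i386: go to [6] */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_I386, 3, 0),
    /* [3] any other architecture: kill */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
    /* [4] x86_64 branch: load the syscall number */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* [5] nanosleep: go to [9], otherwise go to [8] */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, NR_NANOSLEEP_X86_64, 3, 2),
    /* [6] i386 branch: load the syscall number */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* [7] nanosleep: go to [9], otherwise fall through to [8] */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, NR_NANOSLEEP_I386, 1, 0),
    /* [8] everything else is allowed */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    /* [9] nanosleep on either architecture is killed */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
};

/* Installs the filter on the calling thread; returns 0 on success. */
int install_nanosleep_filter(void)
{
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(nanosleep_filter) /
                                sizeof(nanosleep_filter[0])),
        .filter = nanosleep_filter,
    };

    /* Required for unprivileged processes before adding a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

After `install_nanosleep_filter()` succeeds, a call to nanosleep from either a 64-bit or a 32-bit binary should terminate the caller, while other syscalls pass through.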

Author
Owner

@netblue30 commented on GitHub (Oct 22, 2015):

Thank you for the code example. I am merging the text from https://github.com/netblue30/firejail/issues/87 here:

> If your kernel supports x32 executables (the CONFIG_X86_X32=y option), you can use them to bypass a blacklist-based seccomp filter. x32 syscalls are made with the same arch value as x86_64 but use different syscall numbers. This means that the VALIDATE_ARCHITECTURE test doesn't reject x32 executables.
>
> All syscall numbers from x32 executables have bit 30 set to 1. You can check whether the syscall number is greater than or equal to 0x40000000 and reject the syscall if that is the case.

I think we can do it, but we have to be careful: some people use it on architectures such as arm or mips. Maybe we can support this only on amd64.
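
The x32 check described in the quoted text can be sketched as follows, assuming Linux kernel headers and assuming the architecture has already been validated as x86_64 earlier in the filter. The names `X32_SYSCALL_BIT`, `is_x32_syscall` and `reject_x32` are made up for this example; the constant has the same value as the kernel's `__X32_SYSCALL_BIT`.

```c
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Bit 30; same value as the kernel's __X32_SYSCALL_BIT. */
#define X32_SYSCALL_BIT 0x40000000u

/* Plain C version of the test: true for any number at or above bit 30. */
static int is_x32_syscall(unsigned int nr)
{
    return nr >= X32_SYSCALL_BIT;
}

/* The same test as a classic BPF fragment (unsigned comparison). */
static struct sock_filter reject_x32[] = {
    /* [0] load the syscall number */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* [1] nr >= 0x40000000: fall through to [2], otherwise skip to [3] */
    BPF_JUMP(BPF_JMP | BPF_JGE | BPF_K, X32_SYSCALL_BIT, 0, 1),
    /* [2] x32 syscall: kill */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
    /* [3] placeholder; a real filter would continue with its blacklist */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};
```

For instance, nanosleep is 35 on x86_64, so an x32 binary issues it as 0x40000000 + 35, which this check rejects while the native number passes.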

Author
Owner

@dzamlo commented on GitHub (Oct 23, 2015):

I have added to the gist an example using libseccomp, a library which seems to handle all of that (including x32).
I think using it is the way forward, or at least taking some inspiration from it.

If we choose not to use libseccomp, I think supporting this only on amd64 is the more pragmatic choice, at least initially.

Author
Owner

@netblue30 commented on GitHub (Oct 24, 2015):

Firejail is a SUID program; I cannot link to any external libraries!

I'll try something else. Seccomp allows us to chain multiple filters. With some modifications to VALIDATE_ARCHITECTURE, we chain two blacklist filters: the regular one for amd64 and a new one for i386. We do this only for amd64 compilations. For the i386 filter we hardcode the syscall values.
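
The chaining idea can be modelled in plain C without touching the kernel. This is a sketch of the decision logic only, not Firejail's filter: the function names are hypothetical, a single blacklisted syscall (process_vm_writev, 311 on x86_64 and 348 on i386) stands in for the real list, and the combination rule reflects seccomp's documented behaviour that every installed filter runs and the most restrictive result wins.

```c
/* Values of AUDIT_ARCH_I386 and AUDIT_ARCH_X86_64 from <linux/audit.h>. */
#define ARCH_I386   0x40000003u
#define ARCH_X86_64 0xC000003Eu

enum action { ACT_ALLOW = 0, ACT_KILL = 1 };

/* One blacklisted call, with its number hardcoded per architecture. */
#define NR_PROCESS_VM_WRITEV_X86_64 311
#define NR_PROCESS_VM_WRITEV_I386   348

/* Filter 1: judges only x86_64 syscalls and lets other arches pass. */
static enum action amd64_filter(unsigned int arch, unsigned int nr)
{
    if (arch != ARCH_X86_64)
        return ACT_ALLOW;
    return nr == NR_PROCESS_VM_WRITEV_X86_64 ? ACT_KILL : ACT_ALLOW;
}

/* Filter 2: the mirror image for i386, using the i386 number. */
static enum action i386_filter(unsigned int arch, unsigned int nr)
{
    if (arch != ARCH_I386)
        return ACT_ALLOW;
    return nr == NR_PROCESS_VM_WRITEV_I386 ? ACT_KILL : ACT_ALLOW;
}

/* Both filters run on every syscall; a kill from either one wins. */
static enum action chained_verdict(unsigned int arch, unsigned int nr)
{
    if (amd64_filter(arch, nr) == ACT_KILL ||
        i386_filter(arch, nr) == ACT_KILL)
        return ACT_KILL;
    return ACT_ALLOW;
}
```

With this arrangement, number 311 is killed only when it comes from 64-bit code, so a 32-bit program calling set_robust_list (also 311) keeps running. Because each filter passes the other architecture through instead of killing it, neither one blocks the traffic that the other is responsible for.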

Author
Owner

@netblue30 commented on GitHub (Oct 29, 2015):

Fixed! I have a dual i386/amd64 filter running when --seccomp is enabled.

Reference: github-starred/firejail#50