seccomp: Always install filters for native architecture
The commit 65976868 ("seccomp: don't install filters for archs that
can't use syscalls") introduced a regression where filters may not be
installed for the "native" architecture. This means that setting
SystemCallArchitectures=native for a unit effectively disables the
SystemCallFilter= and SystemCallLog= options.

Conceptually, we have two filter stages:
 1. architecture used for syscall (SystemCallArchitectures=)
 2. syscall + architecture combination (SystemCallFilter=)

The above commit tried to optimize the filter generation by skipping the
second level filtering when it is not required.

However, systemd will never fully block the "native" architecture using
the first level filter. This makes the code a lot simpler, as systemd
can execve() the target binary using its own architecture. And, it
should be perfectly fine as the "native" architecture will always be the
one with the most restrictive seccomp filtering.

Said differently, the bug arises because (on x86_64):
 1. x86_64 is permitted by libseccomp already
 2. native != x86_64
 3. the loop wants to block x86_64 because the permitted set only
    contains "native" (i.e. "native" != "x86_64")
 4. x86_64 is marked as blocked in seccomp_local_archs

Thereby we have an inconsistency, where it is marked as blocked in the
seccomp_local_archs array but it is allowed by libseccomp. i.e. we will
skip generating filter stage 2 without having stage 1 in place.

The fix is simple, we just skip the native architecture when looping
seccomp_local_archs. This way the inconsistency cannot happen.

(cherry picked from commit f833df38)
(cherry picked from commit ba8bce7b)
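
The inconsistency described in this message can be illustrated with a small model. This is only a sketch under stated assumptions, not systemd's code: the function names, the `HOST_ARCH` constant, and the use of plain strings for architectures are hypothetical, and only `seccomp_local_archs` and the option names come from the message itself. It assumes an x86_64 host and a unit with SystemCallArchitectures=native.

```python
# Hypothetical model; names are illustrative, not systemd's or libseccomp's API.
HOST_ARCH = "x86_64"                      # assumed architecture systemd was built for
LOCAL_ARCHS = ["x86", "x86_64", "x32"]    # stand-in for seccomp_local_archs


def restrict_archs(permitted, skip_native):
    """Stage 1: return the set of local archs that get marked as blocked."""
    blocked = set()
    for arch in LOCAL_ARCHS:
        if skip_native and arch == HOST_ARCH:
            # The fix: never mark the native architecture as blocked.
            continue
        if arch not in permitted:
            # "native" is a different key than "x86_64", so without the
            # skip above the host architecture ends up here.
            blocked.add(arch)
    return blocked


def stage2_archs(blocked):
    """Stage 2: syscall filters are generated only for archs not marked blocked."""
    return [arch for arch in LOCAL_ARCHS if arch not in blocked]


permitted = {"native"}  # models SystemCallArchitectures=native
before = stage2_archs(restrict_archs(permitted, skip_native=False))
after = stage2_archs(restrict_archs(permitted, skip_native=True))
print("without fix:", before)
print("with fix:   ", after)
```

In this model, the run without the skip leaves no architecture for stage 2, which corresponds to SystemCallFilter= being silently ignored even though the host architecture is still usable. With the skip, the host architecture keeps its stage 2 filter.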