[PATCH v3 0/3] Landlock multithreaded enforcement
Mickaël Salaün
mic at digikod.net
Wed Feb 11 14:55:07 UTC 2026
FYI, syzkaller now supports this new flag, and it has been fuzzed for a
few months (before being merged):
https://github.com/google/syzkaller/commit/e5e258750ba4cad4408ac45a26c0aafff51d45b1
On Thu, Feb 05, 2026 at 07:53:47PM +0100, Mickaël Salaün wrote:
> Good job writing this complex mechanism (and the related doc); this
> patch series is great! It's been in linux-next for a few weeks and I'll
> take it for Linux 7.0.
>
> I did some cosmetic changes though, you'll find them in my commits.
> Some more tests are needed but I'll take this series for now.
>
> Thanks!
>
> On Thu, Nov 27, 2025 at 12:51:33PM +0100, Günther Noack wrote:
> > This patch set adds the LANDLOCK_RESTRICT_SELF_TSYNC flag to
> > landlock_restrict_self(). With this flag, the passed Landlock ruleset
> > will not only be applied to the calling thread, but to all threads
> > which belong to the same process.
> >
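For readers who want to try the flag from user space, a minimal sketch of
a call site follows. The helper name restrict_all_threads() is
hypothetical, and both the flag's bit value and the fallback syscall
number are assumptions made so that the snippet builds against older
headers; the authoritative definitions are the ones in the
<linux/landlock.h> that ships with this series. ruleset_fd is expected to
come from landlock_create_ruleset().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: syscall number used when libc headers predate Landlock. */
#ifndef SYS_landlock_restrict_self
#define SYS_landlock_restrict_self 446
#endif

/*
 * Assumption: bit value of the new flag.  Prefer the definition from
 * <linux/landlock.h> of a kernel that carries this series.
 */
#ifndef LANDLOCK_RESTRICT_SELF_TSYNC
#define LANDLOCK_RESTRICT_SELF_TSYNC (1U << 3)
#endif

/*
 * Hypothetical helper: enforce the ruleset on all threads of the
 * calling process.  Returns 0 on success, -1 with errno set on failure.
 */
static int restrict_all_threads(int ruleset_fd)
{
	/* Unprivileged callers need no_new_privs before enforcing. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return -1;

	if (syscall(SYS_landlock_restrict_self, ruleset_fd,
		    LANDLOCK_RESTRICT_SELF_TSYNC))
		return -1;

	return 0;
}
```

On a kernel that does not know the flag, the call is expected to fail
rather than silently restrict only the calling thread, so a caller can
check errno and decide whether to fall back to another mechanism.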
> > Motivation
> > ==========
> >
> > TL;DR: The libpsx/nptl(7) signal hack which we use in user space for
> > multi-threaded Landlock enforcement is incompatible with Landlock's
> > signal scoping support. Landlock can restrict the use of signals
> > across Landlock domains, but we need signals ourselves in user space
> > in ways that are not permitted any more under these restrictions.
> >
> > Enabling Landlock proves to be difficult in processes that are already
> > multi-threaded at the time of enforcement:
> >
> > * Enforcement in only one thread is usually a mistake because threads
> >   do not normally have proper security boundaries between them.
> >
> > * Also, multithreading is unavoidable in some circumstances, such as
> >   when using Landlock from a Go program. Go programs are already
> >   multithreaded by the time that they enter the "func main()".
> >
> > So far, the approach in Go[1] was to use libpsx[2]. This library
> > implements the mechanism described in nptl(7) [3]: It keeps track of
> > all threads with a linker hack and then makes all threads do the same
> > syscall by registering a signal handler for them and invoking it.
> >
> > With commit 54a6e6bbf3be ("landlock: Add signal scoping"), Landlock
> > gained the ability to restrict the use of signals across different
> > Landlock domains.
> >
> > Landlock's signal scoping support is incompatible with the libpsx
> > approach of enabling Landlock:
> >
> > (1) With libpsx, although all threads enforce the same ruleset object,
> >     they technically do the operation separately and end up in
> >     distinct Landlock domains. This breaks signaling across threads
> >     when using LANDLOCK_SCOPE_SIGNAL.
> >
> > (2) Cross-thread signals are themselves needed to enforce further
> >     nested Landlock domains across multiple threads. So nested
> >     Landlock policies become impossible there.
> >
> > In addition to Landlock itself, cross-thread signals are also needed
> > for other seemingly-harmless API calls like setuid(2) [4] and for
> > the use of libcap (co-developed with libpsx), which have the same
> > problem where the underlying syscall only applies to the calling
> > thread.
> >
> > Implementation details
> > ======================
> >
> > Enforcement prerequisites
> > -------------------------
> >
> > Normally, the prerequisite for enforcing a Landlock policy is to
> > either have CAP_SYS_ADMIN or the no_new_privs flag. With
> > LANDLOCK_RESTRICT_SELF_TSYNC, the no_new_privs flag will automatically
> > be applied for sibling threads if the caller had it.
> >
> > These prerequisites and the "TSYNC" behavior work the same as for
> > Seccomp and its SECCOMP_FILTER_FLAG_TSYNC flag.
> >
> > Pseudo-signals
> > --------------
> >
> > Landlock domains are stored in struct cred, and a task's struct cred
> > can only be modified by the task itself [6].
> >
> > To work within that constraint, we use task_work_add() to register a
> > pseudo-signal for each of the affected threads. At signal execution
> > time, these tasks will coordinate to switch out their Landlock policy
> > in lockstep with each other, guaranteeing all-or-nothing semantics.
> >
> > This implementation can be thought of as a kernel-side implementation
> > of the userspace hack that glibc/NPTL use for setuid(2) [3] [4], and
> > which libpsx implements for libcap [2].
> >
> > Finding all sibling threads
> > ---------------------------
> >
> > In order to avoid grabbing the global tasklist_lock, we employ the
> > scheme proposed by Jann Horn in [7]:
> >
> > 1. Loop through the list of sibling threads
> > 2. Schedule a pseudo-signal for each and make each thread wait in the
> >    pseudo-signal
> > 3. Go back to 1. and look for more sibling threads that we have not
> >    seen yet
> >
> > Do this until no more new threads are found. As all threads were
> > waiting in their pseudo-signals, they cannot spawn additional threads
> > and we found them all.
> >
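The termination argument above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
code in tsync.c: struct model_thread, struct model_process, spawn_budget
and both function names are hypothetical, and "parking" a thread merely
sets a flag instead of scheduling task work. A thread may still create
siblings until the scan reaches it; once parked it cannot, so the number
of unparked threads eventually drops to zero.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREADS 16

/* Hypothetical model of one sibling thread. */
struct model_thread {
	bool parked;      /* waiting inside its (modeled) pseudo-signal */
	int spawn_budget; /* threads it may still create while not parked */
};

/* Hypothetical model of the thread group. */
struct model_process {
	struct model_thread threads[MAX_THREADS];
	size_t count;
};

/* One pass over the thread list; returns how many threads were parked. */
static size_t park_new_threads(struct model_process *p)
{
	size_t newly_parked = 0;
	size_t snapshot = p->count; /* threads visible when the pass starts */

	for (size_t i = 0; i < snapshot; i++) {
		struct model_thread *t = &p->threads[i];

		if (t->parked)
			continue;

		/* Race: the thread spawns siblings before it gets parked. */
		while (t->spawn_budget > 0 && p->count < MAX_THREADS) {
			t->spawn_budget--;
			p->threads[p->count++] =
				(struct model_thread){ false, 0 };
		}

		t->parked = true;
		newly_parked++;
	}
	return newly_parked;
}

/* Repeat passes until one finds nothing new; returns the total parked. */
static size_t park_all_threads(struct model_process *p)
{
	size_t total = 0;
	size_t found;

	while ((found = park_new_threads(p)) > 0)
		total += found;
	return total;
}
```

Starting from two threads where the first one spawns a third during the
first pass, the second pass picks up the newcomer and the third pass
finds nothing, which ends the loop.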
> > Coordination between tasks
> > --------------------------
> >
> > As tasks run their pseudo-signal task work, they coordinate through
> > the following completions:
> >
> > - all_prepared (with counter num_preparing)
> >
> >   When done, all new sibling threads in the inner loop(!) of finding
> >   new threads are now in their pseudo-signal handlers and have
> >   prepared the struct cred object to commit (or written an error into
> >   the shared "preparation_error").
> >
> >   The lifetime of all_prepared is only the inner loop of finding new
> >   threads.
> >
> > - ready_to_commit
> >
> >   When done, the outer loop of finding new threads is done and all
> >   sibling threads have prepared their struct cred object. Marked
> >   completed by the calling thread.
> >
> > - all_finished
> >
> >   When done, all sibling threads are done executing their
> >   pseudo-signal handlers.
> >
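The three completions can be mimicked in user space with pthreads to show
how the all-or-nothing behavior falls out of the ordering. This is a
sketch under assumptions, not the kernel implementation: struct
completion is re-created here with a mutex and a condition variable,
struct tsync_shared, struct sibling_arg, sibling_work() and
sync_siblings() are hypothetical names, there is a single discovery
round, and "committing" just bumps a counter.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_SIBLINGS 8

/* User-space stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete_all(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical shared state; field names follow the cover letter. */
struct tsync_shared {
	struct completion all_prepared, ready_to_commit, all_finished;
	atomic_int num_preparing, num_unfinished;
	atomic_int preparation_error, num_committed;
};

struct sibling_arg {
	struct tsync_shared *shared;
	int prepare_result; /* 0, or a negative errno-style value */
};

static void *sibling_work(void *data)
{
	struct sibling_arg *arg = data;
	struct tsync_shared *s = arg->shared;

	/* Prepare, record any error, then report in. */
	if (arg->prepare_result)
		atomic_store(&s->preparation_error, arg->prepare_result);
	if (atomic_fetch_sub(&s->num_preparing, 1) == 1)
		complete_all(&s->all_prepared);

	/* Commit only once every sibling's result is known. */
	wait_for_completion(&s->ready_to_commit);
	if (!atomic_load(&s->preparation_error))
		atomic_fetch_add(&s->num_committed, 1);

	if (atomic_fetch_sub(&s->num_unfinished, 1) == 1)
		complete_all(&s->all_finished);
	return NULL;
}

/* Returns how many siblings committed: n on success, 0 on any error. */
static int sync_siblings(const int *prepare_results, int n)
{
	struct sibling_arg args[MAX_SIBLINGS];
	pthread_t tids[MAX_SIBLINGS];

	if (n < 1 || n > MAX_SIBLINGS)
		return -1;

	struct tsync_shared s = { .num_preparing = n, .num_unfinished = n };
	init_completion(&s.all_prepared);
	init_completion(&s.ready_to_commit);
	init_completion(&s.all_finished);

	for (int i = 0; i < n; i++) {
		args[i] = (struct sibling_arg){ &s, prepare_results[i] };
		if (pthread_create(&tids[i], NULL, sibling_work, &args[i]))
			abort(); /* keep the sketch simple */
	}

	wait_for_completion(&s.all_prepared);
	complete_all(&s.ready_to_commit);
	wait_for_completion(&s.all_finished);
	for (int i = 0; i < n; i++)
		pthread_join(tids[i], NULL);
	return atomic_load(&s.num_committed);
}
```

Because no sibling passes ready_to_commit before all of them have
reported in, a single failed preparation is visible to everyone by the
time they decide whether to commit.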
> > Use of credentials API
> > ----------------------
> >
> > Under normal circumstances, sibling threads share the same struct cred
> > object. To avoid unnecessary duplication, if we find that a thread
> > uses the same struct cred as the calling thread, we side-step the
> > normal use of the credentials API [6] and place a pointer to that
> > existing struct cred instead of creating a new one using
> > prepare_creds() in the sibling thread.
> >
> > Noteworthy discussion points
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > * We are side-stepping the normal credentials API [6], by re-wiring an
> > existing struct cred object instead of calling prepare_creds().
> >
> > We can technically avoid it, but it would create unnecessary
> > duplicate struct cred objects in multithreaded scenarios.
> >
> > Change Log
> > ==========
> >
> > v3:
> > - bigger organizational changes
> >   - move tsync logic into own file
> >   - tsync: extract count_additional_threads() and
> >     schedule_task_work()
> > - code style
> >   - restrict_one_thread, syscalls.c: use err instead of res (mic)
> >   - restrict_one_thread: inline current_cred variable
> >   - restrict_one_thread: add comment to shortcut logic (mic)
> >   - tsync_works helpers: use size_t i for loop vars
> >   - landlock_cred_copy: skip redundant NULL checks
> >   - function name: s,tsync_works_free,tsync_works_release, (mic)
> >   - tsync_works_grow_by: kzalloc into a temporary variable for
> >     clarity (mic)
> >   - tsync_works_contains_task: make struct task_works const
> > - bugs
> >   - handle kmalloc family failures correctly (jannh)
> >   - tsync_works_release: check task NULL ptr before put
> >   - s/put_task_struct_rcu_user/put_task_struct/ (jannh)
> > - concurrency bugs
> >   - schedule_task_work: do not return error when encountering exiting
> >     tasks. This can happen during normal operation, so we should not
> >     error due to it (jannh)
> >   - landlock_restrict_sibling_threads: make current hold the
> >     num_unfinished/all_finished barrier (more robust, jannh)
> >   - un-wedge the deadlock using wait_for_completion_interruptible
> >     (jannh). See "testing" below and discussion in
> >     https://lore.kernel.org/all/CAG48ez1oS9kANZBq1bt+D76MX03DPHAFp76GJt7z5yx-Na1VLQ@mail.gmail.com/
> > - logic
> >   - tsync_works_grow_by(): grow to size+n, not capacity+n
> >   - tsync_works_grow_by(): add overflow check for capacity increase
> >   - landlock_restrict_self(): make TSYNC and LOG flags work together
> >   - set no_new_privs in the same way as seccomp,
> >     whenever the calling thread had it
> > - testing
> >   - add test where multiple threads call landlock_restrict_self()
> >     concurrently
> >   - test that no_new_privs is implicitly enabled for sibling threads
> > - bump ABI version to v8
> > - documentation improvements
> >   - document ABI v8
> >   - move flag documentation into the landlock.h header
> >   - comment: Explain why we do not need sighand->siglock or
> >     cred_guard_mutex
> >   - various comment improvements
> >   - reminder above struct landlock_cred_security about updating
> >     landlock_cred_copy on changes
> >
> > v2:
> > - https://lore.kernel.org/all/20250221184417.27954-2-gnoack3000@gmail.com/
> > - Semantics:
> >   - Threads implicitly set NO_NEW_PRIVS unless they have
> >     CAP_SYS_ADMIN, to fulfill Landlock policy enforcement
> >     prerequisites
> >   - Landlock policy gets unconditionally overridden even if the
> >     previously established Landlock domains in sibling threads were
> >     diverging.
> > - Restructure discovery of all sibling threads, with the algorithm
> >   proposed by Jann Horn [7]: Loop through threads multiple times, and
> >   get them all stuck in the pseudo signal (task work), until no new
> >   sibling threads show up.
> > - Use RCU lock when iterating over sibling threads.
> > - Override existing Landlock domains of other threads,
> >   instead of applying a new Landlock policy on top
> > - Directly re-wire the struct cred for sibling threads,
> >   instead of creating a new one with prepare_creds().
> > - Tests:
> >   - Remove multi_threaded_failure test
> >     (The only remaining failure case is ENOMEM,
> >     there is no good way to provoke that in a selftest)
> >   - Add test for success despite diverging Landlock domains.
> >
> > [1] https://github.com/landlock-lsm/go-landlock
> > [2] https://sites.google.com/site/fullycapable/who-ordered-libpsx
> > [3] https://man.gnoack.org/7/nptl
> > [4] https://man.gnoack.org/2/setuid#VERSIONS
> > [5] https://lore.kernel.org/all/20240805-remove-cred-transfer-v2-0-a2aa1d45e6b8@google.com/
> > [6] https://www.kernel.org/doc/html/latest/security/credentials.html
> > [7] https://lore.kernel.org/all/CAG48ez0pWg3OTABfCKRk5sWrURM-HdJhQMcWedEppc_z1rrVJw@mail.gmail.com/
> >
> > Günther Noack (3):
> >   landlock: Multithreading support for landlock_restrict_self()
> >   landlock: selftests for LANDLOCK_RESTRICT_SELF_TSYNC
> >   landlock: Document LANDLOCK_RESTRICT_SELF_TSYNC
> >
> >  Documentation/userspace-api/landlock.rst      |   8 +
> >  include/uapi/linux/landlock.h                 |  13 +
> >  security/landlock/Makefile                    |   2 +-
> >  security/landlock/cred.h                      |  12 +
> >  security/landlock/limits.h                    |   2 +-
> >  security/landlock/syscalls.c                  |  66 ++-
> >  security/landlock/tsync.c                     | 555 ++++++++++++++++++
> >  security/landlock/tsync.h                     |  16 +
> >  tools/testing/selftests/landlock/base_test.c  |   8 +-
> >  tools/testing/selftests/landlock/tsync_test.c | 161 +++++
> >  10 files changed, 810 insertions(+), 33 deletions(-)
> >  create mode 100644 security/landlock/tsync.c
> >  create mode 100644 security/landlock/tsync.h
> >  create mode 100644 tools/testing/selftests/landlock/tsync_test.c
> >
> > --
> > 2.52.0.177.g9f829587af-goog
> >
> >