[syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback

Sat Feb 21 12:00:03 UTC 2026

Hello Ding!

On Sat, Feb 21, 2026 at 03:28:47PM +0800, Ding Yihan wrote:
> Since I am relatively new to the inner workings of this specific subsystem, 
> I would like to take a few days to thoroughly study the root cause 
> (the task_work and mutex interaction) and prepare a detailed and proper commit message. 
> 
> I will send out the formal patch (v1) to the mailing list later.

Thank you very much for preparing a patch, and especially also for
forwarding this to us.  (The original syzkaller report was somehow not
addressed to Landlock or the LSM list.  We should fix that.)

Timing wise, the feature was picked up for the 7.0 release, so we
still have some time to fix it before this is stable.

As an early review for the patch:

Background:

We had previously convinced ourselves that grabbing the
cred_guard_mutex was not necessary.  To quote the comment in
landlock_restrict_sibling_threads():

    Unlike seccomp, which modifies sibling tasks directly, we do not need to
    acquire the cred_guard_mutex and sighand->siglock:

    - As in our case, all threads are themselves exchanging their own struct
      cred through the credentials API, no locks are needed for that.
    - Our for_each_thread() loops are protected by RCU.
    - We do not acquire a lock to keep the list of sibling threads stable
      between our for_each_thread loops.  If the list of available sibling
      threads changes between these for_each_thread loops, we make up for
      that by continuing to look for threads until they are all discovered
      and have entered their task_work, where they are unable to spawn new
      threads.

The question of locking cred_guard_mutex came up in the patch
discussion multiple times as well, the most recent discussion was:
https://lore.kernel.org/all/20251020.fohbo6Iecahz@digikod.net/

If it helps, I keep some of my own notes for this particular feature
on https://wiki.gnoack.org/LandlockMultithreadedEnforcement.

(Very) tentative investigation:

In the Syzkaller report [2], it seems that the reproducer [2.1] is
creating two rulesets and then enforcing them in parallel, a scenario
which we are exercising in the TEST(competing_enablement) in
tools/testing/selftests/landlock/tsync_test.c already, but which has
not failed in my own selftest runs.

In the crash report, there are four threads in total:

* Two are stuck in the line
  wait_for_completion(&ctx->ready_to_commit);
  in the per-thread task work (line 128 [4.1])
* Two are stuck in the line
  wait_for_completion(&shared_ctx.all_prepared)
  in the calling thread's coordination logic (line 539 [4.2])

In line 539, we are already on the code path where we detected that we
are getting interrupted by another thread and where we are attempting
to deal with the scenario where two landlock_restrict_self() calls
compete.  This is detected on line 523 when
wait_for_completion_interruptible() is true.  The approach to handle
this is to set the overall -ERESTARTNOINTR error and cancel the work
that has been ongoing so far, by canceling the task works that did not
start running yet and waiting for the ones that did start running
(that is the step where we are blocked!).  The reasoning there was
that these task works will all hit the "all_prepared" stage now, but
as we can see in the stack trace, the task works that are actively
running are already on line 128 and have passed the "all_prepared"
stage).

Differences I can see between syzkaller and our own test:

* The reproducer also calls openat() and then twice socketpair().
  These syscalls should be unrelated, but it's possible that the
  "async" invocation of socketpair() contributes to adding more
  threads. (Assuming that "async" means "in new thread" in syzkaller)
* Syzkaller gives it more attempts. ([2.2])

I do not understand yet what went wrong in our scheme and need to look
deeper.

Ding, do you have more insights into it from your debugging?

Thanks,
–Günther

For reference:

[1] Report Mail: https://lore.kernel.org/all/69984159.050a0220.21cd75.01bb.GAE@google.com/
[2] Report: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
  [2.1] Reproducer: https://syzkaller.appspot.com/text?tag=ReproSyz&x=16e41c02580000
  [2.2] Reproducer (C): https://syzkaller.appspot.com/text?tag=ReproC&x=15813652580000
[3] Patch: https://lore.kernel.org/all/6999504d.a70a0220.2c38d7.0154.GAE@google.com/
[4.1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/security/landlock/tsync.c?id=635c467cc14ebdffab3f77610217c1dacaf88e8c#n128
[4.2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/security/landlock/tsync.c?id=635c467cc14ebdffab3f77610217c1dacaf88e8c#n539