[syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback

Tue Feb 24 03:02:44 UTC 2026

Hi Günther,

Thank you for the detailed analysis! I completely agree that serializing the TSYNC 
operations is the right way to prevent this deadlock. I have drafted a patch using 
`exec_update_lock` (similar to how seccomp uses `cred_guard_mutex`).

Regarding your proposal to split this into two patches (one for the cleanup 
path and one for the lock): Maybe combining them into a single patch is a better choice. Here is why:

We actually *cannot* remove `wait_for_completion(&shared_ctx.all_prepared)` 
in the interrupt recovery path. Since `shared_ctx` is allocated on the local 
stack of the caller, removing the wait would cause a severe Use-After-Free (UAF) if the 
thread returns to userspace while sibling task_works are still executing and dereferencing `ctx`. 

By adding the lock, we inherently resolve the deadlock, meaning the sibling task_works 
will never get stuck. Thus, `wait_for_completion` becomes perfectly safe to keep, 
and it remains strictly necessary to protect the stack memory. Therefore, the "fix" for the 
cleanup path is simply updating the comments to reflect this reality, which is tightly coupled with the locking fix. 
It felt more cohesive as a single patch.

I have test the patch on my laptop,and it will not trigger the issue.Let's have syzbot test this combined logic:

#syz test:

--- a/security/landlock/tsync.c

+++ b/security/landlock/tsync.c

@@ -447,6 +447,12 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,

        shared_ctx.new_cred = new_cred;

        shared_ctx.set_no_new_privs = task_no_new_privs(current);

 

+       /*

+        * Serialize concurrent TSYNC operations to prevent deadlocks

+        * when multiple threads call landlock_restrict_self() simultaneously.

+        */

+       down_write(&current->signal->exec_update_lock);

+

        /*

         * We schedule a pseudo-signal task_work for each of the calling task's

         * sibling threads.  In the task work, each thread:

@@ -527,14 +533,17 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,

                                           -ERESTARTNOINTR);

 

                                /*

-                                * Cancel task works for tasks that did not start running yet,

-                                * and decrement all_prepared and num_unfinished accordingly.

+                                * Opportunistic improvement: try to cancel task works

+                                * for tasks that did not start running yet. We do not

+                                * have a guarantee that it cancels any of the enqueued

+                                * task works (because task_work_run() might already have

+                                * dequeued them).

                                 */

                                cancel_tsync_works(&works, &shared_ctx);

 

                                /*

-                                * The remaining task works have started running, so waiting for

-                                * their completion will finish.

+                                * We must wait for the remaining task works to finish to

+                                * prevent a use-after-free of the local shared_ctx.

                                 */

                                wait_for_completion(&shared_ctx.all_prepared);

                        }

@@ -557,5 +566,7 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,

 

        tsync_works_release(&works);

 

+       up_write(&current->signal->exec_update_lock);

+

        return atomic_read(&shared_ctx.preparation_error);

 }

--
在 2026/2/23 23:16, Günther Noack 写道:
> Hello!
> 
> On Mon, Feb 23, 2026 at 07:29:56PM +0800, Ding Yihan wrote:
>> Thank you for the detailed analysis and the clear breakdown. 
>> Apologies for the delayed response. I spent the last couple of days
>> thoroughly reading through the previous mailing list discussions. I
>> was trying hard to see if there was any viable pure lockless design
>> that could solve this concurrency issue while preserving the original
>> architecture. 
>> 
>> However, after looking at the complexities you outlined, I completely
>> agree with your conclusion: serializing the TSYNC operations is indeed
>> the most robust and reasonable path forward to prevent the deadlock.
>> 
>> Regarding the lock choice, since 'cred_guard_mutex' is explicitly
>> marked as deprecated for new code in the kernel,maybe we can use its
>> modern replacement: 'exec_update_lock' (using down_write_trylock /
>> up_write on current->signal). This aligns with the current subsystem
>> standards and was also briefly touched upon by Jann in the older
>> discussions.
>> 
>> I fully understand the requirement for the two-part patch series:
>> 1. Cleaning up the cancellation logic and comments.
>> 2. Introducing the serialization lock for TSYNC.
>> 
>> I will take some time to draft and test this patch series properly. 
>> I also plan to discuss this with my kernel colleagues here at 
>> UnionTech to see if they have any additional suggestions on the 
>> implementation details before I submit it.
>> 
>> I will send out the v1 patch series to the list as soon as it is
>> ready. Thanks again for your guidance and the great discussion!
> 
> Thank you, Ding, this is much appreciated!
> 
> I agree, the `exec_update_lock` might be the better solution;
> I also need to familiarize myself more with it to double-check.
> 
> —Günther
>