[PATCH v2 1/2] landlock: Multithreading support for landlock_restrict_self()
Günther Noack
gnoack at google.com
Thu Nov 27 09:56:31 UTC 2025
Hello!
On Mon, Oct 20, 2025 at 10:12:44PM +0200, Mickaël Salaün wrote:
> On Wed, Oct 01, 2025 at 01:18:06PM +0200, Günther Noack wrote:
> > +/*
> > + * restrict_sibling_threads - enables a Landlock policy for all sibling threads
> > + */
> > +static int restrict_sibling_threads(const struct cred *old_cred,
> > + const struct cred *new_cred)
> > +{
> > + int res;
> > + struct task_struct *thread, *caller;
> > + struct tsync_shared_context shared_ctx;
> > + struct tsync_works works = {};
> > + size_t newly_discovered_threads;
> > + bool found_more_threads;
> > + struct tsync_work *ctx;
> > +
> > + atomic_set(&shared_ctx.preparation_error, 0);
> > + init_completion(&shared_ctx.all_prepared);
> > + init_completion(&shared_ctx.ready_to_commit);
> > + atomic_set(&shared_ctx.num_unfinished, 0);
> > + init_completion(&shared_ctx.all_finished);
> > + shared_ctx.old_cred = old_cred;
> > + shared_ctx.new_cred = new_cred;
> > +
> > + caller = current;
> > +
> > + /*
> > + * We schedule a pseudo-signal task_work for each of the calling task's
> > + * sibling threads. In the task work, each thread:
> > + *
> > + * 1) runs prepare_creds() and writes back the error to
> > + * shared_ctx.preparation_error, if needed.
> > + *
> > + * 2) signals that it's done with prepare_creds() to the calling task.
> > + * (completion "all_prepared").
> > + *
> > + * 3) waits for the completion "ready_to_commit". This is sent by the
> > + * calling task after ensuring that all sibling threads are done
> > + * with the "preparation" stage.
> > + *
> > + * After this barrier is reached, it's safe to read
> > + * shared_ctx.preparation_error.
> > + *
> > + * 4) reads shared_ctx.preparation_error and then either does
> > + * commit_creds() or abort_creds().
> > + *
> > + * 5) signals that it's done altogether (barrier synchronization
> > + * "all_finished")
> > + */
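
As an aside, to connect the dots: the per-thread callback mirrors
these five steps.  A simplified sketch, leaving out how the patch
actually builds each thread's new creds from shared_ctx->new_cred and
some of the error handling:

        static void restrict_one_thread_callback(struct callback_head *work)
        {
                struct tsync_work *ctx =
                        container_of(work, struct tsync_work, work);
                struct tsync_shared_context *shared = ctx->shared_ctx;
                struct cred *cred;

                /* Step 1: prepare new creds, record any error. */
                cred = prepare_creds();
                if (!cred)
                        atomic_set(&shared->preparation_error, -ENOMEM);

                /* Step 2: tell the caller this thread is prepared. */
                if (atomic_dec_and_test(&shared->num_preparing))
                        complete(&shared->all_prepared);

                /* Step 3: wait until *all* threads are prepared. */
                wait_for_completion(&shared->ready_to_commit);

                /* Step 4: commit or abort, depending on the shared verdict. */
                if (cred) {
                        if (atomic_read(&shared->preparation_error))
                                abort_creds(cred);
                        else
                                commit_creds(cred);
                }

                /* Step 5: signal overall completion to the caller. */
                if (atomic_dec_and_test(&shared->num_unfinished))
                        complete(&shared->all_finished);
        }
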
> > + do {
> > + found_more_threads = false;
> > +
> > + /*
> > + * The "all_prepared" barrier is used locally to the inner loop,
> > + * this use of for_each_thread(). We can reset it on each loop
> > + * iteration because all previous loop iterations are done with
> > + * it already.
> > + *
> > + * num_preparing is initialized to 1 so that the counter cannot
> > + * go to 0 and mark the completion as done before all task works
> > + * are registered. (We decrement it at the end of this loop.)
> > + */
> > + atomic_set(&shared_ctx.num_preparing, 1);
> > + reinit_completion(&shared_ctx.all_prepared);
> > +
>
> > + /* Under the RCU read lock, count the threads we need. */
> > + newly_discovered_threads = 0;
> > + rcu_read_lock();
> > + for_each_thread(caller, thread) {
> > + /* Skip current, since it is initiating the sync. */
> > + if (thread == caller)
> > + continue;
> > +
> > + /* Skip exited threads. */
> > + if (thread->flags & PF_EXITING)
> > + continue;
> > +
> > + /* Skip threads that we have already seen. */
> > + if (tsync_works_contains_task(&works, thread))
> > + continue;
> > +
> > + newly_discovered_threads++;
> > + }
> > + rcu_read_unlock();
>
> This RCU block could be moved into a dedicated helper that returns the
> number of newly discovered threads. In this helper, we could use
> guard(rcu)().
Done.
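
For reference, the helper now looks roughly like this (the helper name
is mine; the exact shape in v3 may differ slightly):

        static size_t count_new_sibling_threads(struct task_struct *caller,
                                                struct tsync_works *works)
        {
                struct task_struct *thread;
                size_t count = 0;

                guard(rcu)();
                for_each_thread(caller, thread) {
                        /* Skip current, exited and already-seen threads. */
                        if (thread == caller ||
                            (thread->flags & PF_EXITING) ||
                            tsync_works_contains_task(works, thread))
                                continue;
                        count++;
                }
                return count;
        }
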
> > +
> > + if (newly_discovered_threads == 0)
> > + break; /* done */
> > +
> > + res = tsync_works_grow_by(&works, newly_discovered_threads,
> > + GFP_KERNEL_ACCOUNT);
> > + if (res) {
> > + atomic_set(&shared_ctx.preparation_error, res);
> > + break;
> > + }
> > +
> > + rcu_read_lock();
> > + for_each_thread(caller, thread) {
> > + /* Skip current, since it is initiating the sync. */
> > + if (thread == caller)
> > + continue;
> > +
> > + /* Skip exited threads. */
> > + if (thread->flags & PF_EXITING)
> > + continue;
> > +
> > + /* Skip threads that we already looked at. */
> > + if (tsync_works_contains_task(&works, thread))
> > + continue;
> > +
> > + /*
> > + * We found a sibling thread that does not have its
> > + * task_work queued yet, and which might spawn new threads
> > + * before our task work runs, so we need at least one
> > + * more round in the outer loop.
> > + */
> > + found_more_threads = true;
> > +
> > + ctx = tsync_works_provide(&works, thread);
> > + if (!ctx) {
> > + /*
> > + * We ran out of preallocated contexts -- we
> > + * need to try again with this thread at a later
> > + * time! found_more_threads is already true
> > + * at this point.
> > + */
> > + break;
> > + }
> > +
> > + ctx->shared_ctx = &shared_ctx;
> > +
> > + atomic_inc(&shared_ctx.num_preparing);
> > + atomic_inc(&shared_ctx.num_unfinished);
> > +
> > + init_task_work(&ctx->work,
> > + restrict_one_thread_callback);
> > + res = task_work_add(thread, &ctx->work, TWA_SIGNAL);
> > + if (res) {
> > + /*
> > + * Remove the task from ctx so that we will
> > + * revisit the task at a later stage, if it
> > + * still exists.
> > + */
> > + put_task_struct_rcu_user(ctx->task);
> > + ctx->task = NULL;
> > +
> > + atomic_set(&shared_ctx.preparation_error, res);
> > + atomic_dec(&shared_ctx.num_preparing);
> > + atomic_dec(&shared_ctx.num_unfinished);
> > + }
> > + }
> > + rcu_read_unlock();
>
> As with the other RCU block, it might help to move this RCU block into
> a dedicated helper. That would make it easier to read and maintain.
Done.
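
Likewise, the scheduling block became a helper along these lines
(again, the name is mine; it returns whether more threads may still
show up, so that the caller knows to do another round):

        static bool queue_sibling_task_works(
                        struct task_struct *caller, struct tsync_works *works,
                        struct tsync_shared_context *shared_ctx)
        {
                struct task_struct *thread;
                struct tsync_work *ctx;
                bool found_more_threads = false;
                int res;

                guard(rcu)();
                for_each_thread(caller, thread) {
                        if (thread == caller ||
                            (thread->flags & PF_EXITING) ||
                            tsync_works_contains_task(works, thread))
                                continue;

                        /*
                         * This thread may spawn others before our task
                         * work runs; loop at least once more.
                         */
                        found_more_threads = true;

                        ctx = tsync_works_provide(works, thread);
                        if (!ctx)
                                break;  /* Out of contexts; retry later. */

                        ctx->shared_ctx = shared_ctx;
                        atomic_inc(&shared_ctx->num_preparing);
                        atomic_inc(&shared_ctx->num_unfinished);

                        init_task_work(&ctx->work,
                                       restrict_one_thread_callback);
                        res = task_work_add(thread, &ctx->work, TWA_SIGNAL);
                        if (res) {
                                /* Forget the task; revisit it later. */
                                put_task_struct_rcu_user(ctx->task);
                                ctx->task = NULL;
                                atomic_set(&shared_ctx->preparation_error,
                                           res);
                                atomic_dec(&shared_ctx->num_preparing);
                                atomic_dec(&shared_ctx->num_unfinished);
                        }
                }
                return found_more_threads;
        }
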
> > + /*
> > + * Decrement num_preparing for current, undoing its
> > + * initialization to 1 at the beginning of the inner loop.
> > + */
> > + if (atomic_dec_return(&shared_ctx.num_preparing) > 0)
> > + wait_for_completion(&shared_ctx.all_prepared);
> > + } while (found_more_threads &&
> > + !atomic_read(&shared_ctx.preparation_error));
>
> Is this safe against inconsistencies wrt execve? seccomp uses
> cred_guard_mutex (new code should probably use exec_update_lock);
> why should Landlock not do the same?
>
> Why shouldn't we lock sighand as well?
>
> Answers to these questions should be explained in comments.
Added a comment at the top of the loop, based on Jann's explanation in [1].
[1] https://lore.kernel.org/all/CAG48ez3MxN524ge_sZeTwL0FEDASaSTb-gm1vPO8UwpijTeHqw@mail.gmail.com/
—Günther