[PATCH v2 1/2] landlock: Multithreading support for landlock_restrict_self()
Günther Noack
gnoack at google.com
Thu Nov 27 09:56:31 UTC 2025
Hello!
On Mon, Oct 20, 2025 at 10:12:44PM +0200, Mickaël Salaün wrote:
> On Wed, Oct 01, 2025 at 01:18:06PM +0200, Günther Noack wrote:
> > +/*
> > + * restrict_sibling_threads - enables a Landlock policy for all sibling threads
> > + */
> > +static int restrict_sibling_threads(const struct cred *old_cred,
> > + const struct cred *new_cred)
> > +{
> > + int res;
> > + struct task_struct *thread, *caller;
> > + struct tsync_shared_context shared_ctx;
> > + struct tsync_works works = {};
> > + size_t newly_discovered_threads;
> > + bool found_more_threads;
> > + struct tsync_work *ctx;
> > +
> > + atomic_set(&shared_ctx.preparation_error, 0);
> > + init_completion(&shared_ctx.all_prepared);
> > + init_completion(&shared_ctx.ready_to_commit);
> > + atomic_set(&shared_ctx.num_unfinished, 0);
> > + init_completion(&shared_ctx.all_finished);
> > + shared_ctx.old_cred = old_cred;
> > + shared_ctx.new_cred = new_cred;
> > +
> > + caller = current;
> > +
> > + /*
> > + * We schedule a pseudo-signal task_work for each of the calling task's
> > + * sibling threads. In the task work, each thread:
> > + *
> > + * 1) runs prepare_creds() and writes back the error to
> > + * shared_ctx.preparation_error, if needed.
> > + *
> > + * 2) signals that it's done with prepare_creds() to the calling task.
> > + * (completion "all_prepared").
> > + *
> > + * 3) waits for the completion "ready_to_commit". This is sent by the
> > + * calling task after ensuring that all sibling threads are done
> > + * with the "preparation" stage.
> > + *
> > + * After this barrier is reached, it's safe to read
> > + * shared_ctx.preparation_error.
> > + *
> > + * 4) reads shared_ctx.preparation_error and then either does
> > + * commit_creds() or abort_creds().
> > + *
> > + * 5) signals that it's done altogether (barrier synchronization
> > + * "all_finished")
> > + */
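
As an aside, to connect the dots: the per-thread callback mirrors
these five steps.  A simplified sketch, leaving out how the patch
actually builds each thread's new creds from shared_ctx->new_cred and
some of the error handling:

        static void restrict_one_thread_callback(struct callback_head *work)
        {
                struct tsync_work *ctx =
                        container_of(work, struct tsync_work, work);
                struct tsync_shared_context *shared = ctx->shared_ctx;
                struct cred *cred;

                /* Step 1: prepare new creds, record any error. */
                cred = prepare_creds();
                if (!cred)
                        atomic_set(&shared->preparation_error, -ENOMEM);

                /* Step 2: tell the caller this thread is prepared. */
                if (atomic_dec_and_test(&shared->num_preparing))
                        complete(&shared->all_prepared);

                /* Step 3: wait until *all* threads are prepared. */
                wait_for_completion(&shared->ready_to_commit);

                /* Step 4: commit or abort, depending on the shared verdict. */
                if (cred) {
                        if (atomic_read(&shared->preparation_error))
                                abort_creds(cred);
                        else
                                commit_creds(cred);
                }

                /* Step 5: signal overall completion to the caller. */
                if (atomic_dec_and_test(&shared->num_unfinished))
                        complete(&shared->all_finished);
        }
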
> > + do {
> > + found_more_threads = false;
> > +
> > + /*
> > + * The "all_prepared" barrier is used locally to the inner loop,
> > + * this use of for_each_thread(). We can reset it on each loop
> > + * iteration because all previous loop iterations are done with
> > + * it already.
> > + *
> > + * num_preparing is initialized to 1 so that the counter cannot
> > + * go to 0 and mark the completion as done before all task works
> > + * are registered. (We decrement it at the end of this loop.)
> > + */
> > + atomic_set(&shared_ctx.num_preparing, 1);
> > + reinit_completion(&shared_ctx.all_prepared);
> > +
>
> > + /* Under the RCU read lock, count the threads we need. */
> > + newly_discovered_threads = 0;
> > + rcu_read_lock();
> > + for_each_thread(caller, thread) {
> > + /* Skip current, since it is initiating the sync. */
> > + if (thread == caller)
> > + continue;
> > +
> > + /* Skip exited threads. */
> > + if (thread->flags & PF_EXITING)
> > + continue;
> > +
> > + /* Skip threads that we have already seen. */
> > + if (tsync_works_contains_task(&works, thread))
> > + continue;
> > +
> > + newly_discovered_threads++;
> > + }
> > + rcu_read_unlock();
>
> This RCU block could be moved into a dedicated helper that returns the
> number of newly discovered threads. In this helper, we could use
> guard(rcu)().
Done.
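
For reference, the helper now looks roughly like this (the helper name
is mine; the exact shape in v3 may differ slightly):

        static size_t count_new_sibling_threads(struct task_struct *caller,
                                                struct tsync_works *works)
        {
                struct task_struct *thread;
                size_t count = 0;

                guard(rcu)();
                for_each_thread(caller, thread) {
                        /* Skip current, exited and already-seen threads. */
                        if (thread == caller ||
                            (thread->flags & PF_EXITING) ||
                            tsync_works_contains_task(works, thread))
                                continue;
                        count++;
                }
                return count;
        }
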
> > +
> > + if (newly_discovered_threads == 0)
> > + break; /* done */
> > +
> > + res = tsync_works_grow_by(&works, newly_discovered_threads,
> > + GFP_KERNEL_ACCOUNT);
> > + if (res) {
> > + atomic_set(&shared_ctx.preparation_error, res);
> > + break;
> > + }
> > +
> > + rcu_read_lock();
> > + for_each_thread(caller, thread) {
> > + /* Skip current, since it is initiating the sync. */
> > + if (thread == caller)
> > + continue;
> > +
> > + /* Skip exited threads. */
> > + if (thread->flags & PF_EXITING)
> > + continue;
> > +
> > + /* Skip threads that we already looked at. */
> > + if (tsync_works_contains_task(&works, thread))
> > + continue;
> > +
> > + /*
> > + * We found a sibling thread that does not have its
> > + * task_work queued yet, and which might spawn new threads
> > + * before our task work runs, so we need at least one
> > + * more round in the outer loop.
> > + */
> > + found_more_threads = true;
> > +
> > + ctx = tsync_works_provide(&works, thread);
> > + if (!ctx) {
> > + /*
> > + * We ran out of preallocated contexts -- we
> > + * need to try again with this thread at a later
> > + * time! found_more_threads is already true
> > + * at this point.
> > + */
> > + break;
> > + }
> > +
> > + ctx->shared_ctx = &shared_ctx;
> > +
> > + atomic_inc(&shared_ctx.num_preparing);
> > + atomic_inc(&shared_ctx.num_unfinished);
> > +
> > + init_task_work(&ctx->work,
> > + restrict_one_thread_callback);
> > + res = task_work_add(thread, &ctx->work, TWA_SIGNAL);
> > + if (res) {
> > + /*
> > + * Remove the task from ctx so that we will
> > + * revisit the task at a later stage, if it
> > + * still exists.
> > + */
> > + put_task_struct_rcu_user(ctx->task);
> > + ctx->task = NULL;
> > +
> > + atomic_set(&shared_ctx.preparation_error, res);
> > + atomic_dec(&shared_ctx.num_preparing);
> > + atomic_dec(&shared_ctx.num_unfinished);
> > + }
> > + }
> > + rcu_read_unlock();
>
> As with the other RCU block, it might help to move this RCU block into
> a dedicated helper. That would make it easier to read and maintain.
Done.
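
Likewise, the scheduling block became a helper along these lines
(again, the name is mine; it returns whether more threads may still
show up, so that the caller knows to do another round):

        static bool queue_sibling_task_works(
                        struct task_struct *caller, struct tsync_works *works,
                        struct tsync_shared_context *shared_ctx)
        {
                struct task_struct *thread;
                struct tsync_work *ctx;
                bool found_more_threads = false;
                int res;

                guard(rcu)();
                for_each_thread(caller, thread) {
                        if (thread == caller ||
                            (thread->flags & PF_EXITING) ||
                            tsync_works_contains_task(works, thread))
                                continue;

                        /*
                         * This thread may spawn others before our task
                         * work runs; loop at least once more.
                         */
                        found_more_threads = true;

                        ctx = tsync_works_provide(works, thread);
                        if (!ctx)
                                break;  /* Out of contexts; retry later. */

                        ctx->shared_ctx = shared_ctx;
                        atomic_inc(&shared_ctx->num_preparing);
                        atomic_inc(&shared_ctx->num_unfinished);

                        init_task_work(&ctx->work,
                                       restrict_one_thread_callback);
                        res = task_work_add(thread, &ctx->work, TWA_SIGNAL);
                        if (res) {
                                /* Forget the task; revisit it later. */
                                put_task_struct_rcu_user(ctx->task);
                                ctx->task = NULL;
                                atomic_set(&shared_ctx->preparation_error,
                                           res);
                                atomic_dec(&shared_ctx->num_preparing);
                                atomic_dec(&shared_ctx->num_unfinished);
                        }
                }
                return found_more_threads;
        }
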
> > + /*
> > + * Decrement num_preparing for current, undoing its
> > + * initialization to 1 at the beginning of the inner loop.
> > + */
> > + if (atomic_dec_return(&shared_ctx.num_preparing) > 0)
> > + wait_for_completion(&shared_ctx.all_prepared);
> > + } while (found_more_threads &&
> > + !atomic_read(&shared_ctx.preparation_error));
>
> Is this safe against inconsistencies wrt execve? seccomp uses
> cred_guard_mutex (new code should probably use exec_update_lock);
> why should Landlock not do the same?
>
> Why shouldn't we lock sighand as well?
>
> Answers to these questions should be explained in comments.
Added a comment at the top of the loop, based on Jann's explanation in [1].
[1] https://lore.kernel.org/all/CAG48ez3MxN524ge_sZeTwL0FEDASaSTb-gm1vPO8UwpijTeHqw@mail.gmail.com/
—Günther