[PATCH v5 2/9] landlock: Control pathname UNIX domain socket resolution by path

Wed Mar 11 04:46:45 UTC 2026

On Sun, Mar 8, 2026 at 1:18 AM Mickaël Salaün <mic at digikod.net> wrote:
>
> On Fri, Feb 20, 2026 at 03:33:28PM +0100, Günther Noack wrote:
> > +netdev, we could use some advice on the locking approach in af_unix (see below)
> >
> > On Wed, Feb 18, 2026 at 10:37:14AM +0100, Mickaël Salaün wrote:
> > > On Sun, Feb 15, 2026 at 11:51:50AM +0100, Günther Noack wrote:
> > > > diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> > > > index f88fa1f68b77..3a8fc3af0d64 100644
> > > > --- a/include/uapi/linux/landlock.h
> > > > +++ b/include/uapi/linux/landlock.h
> > > > @@ -248,6 +248,15 @@ struct landlock_net_port_attr {
> > > >   *
> > > >   *   This access right is available since the fifth version of the Landlock
> > > >   *   ABI.
> > > > + * - %LANDLOCK_ACCESS_FS_RESOLVE_UNIX: Look up pathname UNIX domain sockets
> > > > + *   (:manpage:`unix(7)`).  On UNIX domain sockets, this restricts both calls to
> > > > + *   :manpage:`connect(2)` as well as calls to :manpage:`sendmsg(2)` with an
> > > > + *   explicit recipient address.
> > > > + *
> > > > + *   This access right only applies to connections to UNIX server sockets which
> > > > + *   were created outside of the newly created Landlock domain (e.g. from within
> > > > + *   a parent domain or from an unrestricted process).  Newly created UNIX
> > > > + *   servers within the same Landlock domain continue to be accessible.
> > >
> > > It might help to add a reference to the explicit scope mechanism.
> > >
> > > Please squash patch 9/9 into this one and also add a reference here to
> > > the rationale described in security/landlock.rst
> >
> > Sounds good, will do.
> >
> >
> > > > +static void unmask_scoped_access(const struct landlock_ruleset *const client,
> > > > +                          const struct landlock_ruleset *const server,
> > > > +                          struct layer_access_masks *const masks,
> > > > +                          const access_mask_t access)
> > >
> > > This helper should be moved to task.c and factored out with
> > > domain_is_scoped().  This should be a dedicated patch.
> >
> > (already discussed in another follow-up mail)
> >
> >
> > > > +static int hook_unix_find(const struct path *const path, struct sock *other,
> > > > +                   int flags)
> > > > +{
> > > > + const struct landlock_ruleset *dom_other;
> > > > + const struct landlock_cred_security *subject;
> > > > + struct layer_access_masks layer_masks;
> > > > + struct landlock_request request = {};
> > > > + static const struct access_masks fs_resolve_unix = {
> > > > +         .fs = LANDLOCK_ACCESS_FS_RESOLVE_UNIX,
> > > > + };
> > > > +
> > > > + /* Lookup for the purpose of saving coredumps is OK. */
> > > > + if (unlikely(flags & SOCK_COREDUMP))
> > > > +         return 0;
> > > > +
> > > > + /* Access to the same (or a lower) domain is always allowed. */
> > > > + subject = landlock_get_applicable_subject(current_cred(),
> > > > +                                           fs_resolve_unix, NULL);
> > > > +
> > > > + if (!subject)
> > > > +         return 0;
> > > > +
> > > > + if (!landlock_init_layer_masks(subject->domain, fs_resolve_unix.fs,
> > > > +                                &layer_masks, LANDLOCK_KEY_INODE))
> > > > +         return 0;
> > > > +
> > > > + /* Checks the layers in which we are connecting within the same domain. */
> > > > + dom_other = landlock_cred(other->sk_socket->file->f_cred)->domain;
> > >
> > > We need to call unix_state_lock(other) before reading it, and check for
> > > SOCK_DEAD, and check sk_socket before dereferencing it.  Indeed,
> > > the socket can be make orphan (see unix_dgram_sendmsg and
> > > unix_stream_connect).  I *think* a socket cannot be "resurrected" or
> > > recycled once dead, so we may assume there is no race condition wrt
> > > dom_other, but please double check.  This lockless call should be made
> > > clear in the LSM hook.  It's OK to not lock the socket before
> > > security_unix_find() (1) because no LSM might implement and (2) they
> > > might not need to lock the socket (e.g. if the caller is not sandboxed).
> > >
> > > The updated code should look something like this:
> > >
> > > unix_state_unlock(other);
>
> unix_state_lock(other) of course...
>
> > > if (unlikely(sock_flag(other, SOCK_DEAD) || !other->sk_socket)) {
> > >     unix_state_unlock(other);
> > >     return 0;
> > > }
> > >
> > > dom_other = landlock_cred(other->sk_socket->file->f_cred)->domain;
> > > unix_state_unlock(other);
> >
> > Thank you for spotting the locking concern!
> >
> > @anyone from netdev, could you please advise on the correct locking
> > approach here?
> >
> > * Do we need ot check SOCK_DEAD?

It depends ?  But I don't see your full patch and have no
idea how it will be used.

> >
> >   You are saying that we need to do that, but it's not clear to me
> >   why.
> >
> >   If you look at the places where unix_find_other() is called in
> >   af_unix.c, then you'll find that all of them check for SOCK_DEAD and
> >   then restart from unix_find_other() if they do actually discover
> >   that the socket is dead.

Note all callers of unix_find_other() later locks the other
and double check SOCK_DEAD.

The same check in unix_find_other() is to avoid unnecessary
locking in case the socket is dying.

> >  I think that this is catching this race
> >   condition scenario:
> >
> >   * a server socket exists and is alive
> >   * A client connects: af_unix.c's unix_stream_connect() calls
> >     unix_find_other() and finds the server socket...
> >   * (Concurrently): The server closes the socket and enters
> >     unix_release_sock().  This function:
> >     1. disassociates the server sock from the named socket inode
> >        number in the hash table (=> future unix_find_other() calls
> >        will fail).
> >     2. calls sock_orphan(), which sets SOCK_DEAD.
> >   * ...(client connection resuming): unix_stream_connect() continues,
> >     grabs the unix_state_lock(), which apparently protects everything
> >     here, checks that the socket is not dead - and discovers that it
> >     IS suddenly dead.  This was not supposed to happen.  The code
> >     recovers from that by retrying everything starting with the
> >     unix_find_other() call.  From unix_release_sock(), we know that
> >     the inode is not associated with the sock any more -- so the
> >     unix_find_socket_by_inode() call should be failing on the next
> >     attempt.
> >
> >   (This works with unix_dgram_connect() and unix_dgram_sendmsg() as
> >   well.)
> >
> >   The comments next to the SOCK_DEAD checks are also suggesting this.
> >
> > * What lock to use
> >
> >   I am having trouble reasoning about what lock is used for what in
> >   this code.
>
> It's not clear to me neither, and it looks like it's not consistent
> across protocols.
>
> >
> >   Is it possible that the lock protecting ->sk_socket is the
> >   ->sk_callback_lock, not the unix_state_lock()?

Yes, but unix_state_lock() is better.

BPF SOCKMAP is the major user of sk_callback_lock.

> > The only callers to
> >   sk_set_socket are either sock_orphan/sock_graft (both grabbing
> >   sk_callback_lock), or they are creating new struct sock objects that
> >   they own exclusively, and don't need locks yet.
> >
> >   Admittedly, in af_unix.c, sock_orphan() and sock_graft() only get
> >   called in contexts where the unix_state_lock() is held, so it would
> >   probably work as well to lock that, but it is maybe a more
> >   fine-grained approach to use sk_callback_lock?
> >
> >
> > So... how about a scheme where we only check for ->sk_socket not being
> > NULL:
> >
> > read_lock_bh(&other->sk_callback_lock);
> > struct sock *other_sk = other->sk_socket;
> > if (!other_sk) {
> >       read_unlock_bh(&other->sk_callback_lock);
> >       return 0;
> > }
> > /* XXX double check whether we need a lock here too */
> > struct file *file = other_sk->file;
> > if (!other_file) {
> >       read_unlock_bh(&other->sk_callback_lock);
> >       return 0;
> > }
> > read_unlock_bh(&other->sk_callback_lock);
> >
> > If this fails, that would in my understanding also be because the
> > socket has died after the path lookup.  We'd then return 0 (success),
> > because we know that the surrounding SOCK_DEAD logic will repeat
> > everything starting from the path lookup operation (this time likely
> > failing with ECONNREFUSED, but maybe also with a success, if another
> > server process was quick enough).
> >
> > Does this sound reasonable?
>
> Actually, since commit 983512f3a87f ("net: Drop the lock in
> skb_may_tx_timestamp()"), we can just use RCU + READ_ONCE(sk_socket) +
> READ_ONCE(file).  The socket and file should only be freed after the RCU
> grace periode.  As a safeguard, this commit should be a Depends-on.

Note this commit is for the networking fast path (interrupt context),
where we want to avoid unnecessary locking as much as possible.

AF_UNIX works in the process context only and does not
need to follow the pattern.

>
> However, it is safer to return -ECONNREFULED when sk_socket or file are
> NULL.
>
> I would be good to hear from netdev folks though.
>
> TIL, there is an LSM hook for sock_graft().
>
> > –Günther
> >