LSM namespacing API

Thu Aug 21 02:05:12 UTC 2025

On Tue, Aug 19, 2025 at 02:51:00PM -0400, Paul Moore wrote:
> On Tue, Aug 19, 2025 at 1:47 PM Stephen Smalley
> <stephen.smalley.work at gmail.com> wrote:
> >
> > I think we want to be able to unshare a specific security module
> > namespace without unsharing the others, i.e. just SELinux or just
> > AppArmor.
> > Not sure if your suggestion above supports that already but wanted to note it.
> 
> The lsm_set_self_attr(2) approach allows for LSM specific unshare
> operations.  Take the existing LSM_ATTR_EXEC attribute as an example,
> two LSMs have implemented support (AppArmor and SELinux), and
> userspace can independently set the attribute as desired for each LSM.

Overall I really like the idea.

> > Serge pointed out that we also will need an API to attach to an
> > existing SELinux namespace, which I captured here:
> > https://github.com/stephensmalley/selinuxns/issues/19
> > This is handled for other Linux namespaces by opening a pseudo file
> > under /proc/pid/ns and invoking setns(2), so not sure how we want to
> > do it.
> 
> One option would be to have a the LSM framework return a LSM namespace
> "handle" for a given LSM using lsm_get_self_attr(2) and then do a
> setns(2)-esque operation using lsm_set_self_attr(2) with that
> "handle".  We would need to figure out what would constitute a
> "handle" but let's just mark that as TBD for now with this approach (I
> think better options are available).

The use case which would be complicated (not blocked) by this, is

* a runtime creates a process p1
  * p1 unshares its lsm namespace
* runtime forks a debug/admin process p2
  * p2 wants to enter p1's namespace

Of course the runtime could work around it by, before relinquishing
control of p1 to a new executable, returning the lsm_get_self_attr()
data to over a pipe.

Note I don't think we should support setting another task's namespace,
only getting its namespace ID.

> Since we have an existing LSM namespace combination, with processes
> running inside of it, it might be sufficient to simply support moving
> into an existing LSM namespace set with setns(2) using only a pidfd
> and a new CLONE_LSMNS flag (or similar, upstream might want this as
> CLONE_NEWLSM).  This would simply set the LSM namespace set for the
> setns(2) caller to match that of the target pidfd.  We still wouldn't
> want to support CLONE_LSMNS/CLONE_NEWLSM for clone*().

A part of me is telling (another part of) me that being able to setns
to a subset of the lsms could lead to privilege escapes through
weird policy configurations for the various LSMs.  In which case,
an all-or-nothing LSM setns might actually be preferable.

I haven't thought of a concrete example, though.

> Any other ideas?
> 
> -- 
> paul-moore.com