LSM namespacing API

Mon Mar 30 00:56:37 UTC 2026

On Sun, Mar 29, 2026 at 12:09 PM Dr. Greg <greg at enjellic.com> wrote:
> On Tue, Mar 24, 2026 at 05:31:09PM -0400, Paul Moore wrote:
> > On Tue, Mar 3, 2026 at 11:46 AM Paul Moore <paul at paul-moore.com> wrote:
> > >
> > > I'd really like to hear from some of the other LSMs before we start
> > > diving into the code.  It may sound funny, but from my perspective
> > > doing the work to get the API definition "right" is far more important
> > > than implementing it.
>
> > It's been three weeks now, and I haven't seen any strong arguments for
> > supporting the clone() API at this time, so we can leave that out for
> > now and stick with just the unshare() API for an initial attempt.  We
> > can always add a clone() API at a later date if needed; going small
> > and expanding over time is usually a better decision anyway.
> >
> > So to quickly summarize, here is where I think the discussion landed:
> >
> > * Implement the lsm_unshare() syscall
> >
> > I expect it would look something like 'lsm_unshare(struct lsm_ctx
> > *ctx, u32 size, u32 flags)' with @ctx specifying the particular LSM
> > being unshared, and @flags being 0/unused at this point in time
> > (unless we can think of something we want to specify here).  Like
> > lsm_set_self_attr(), only one @ctx can be specified at a time, so you
> > can only unshare one LSM at a time.
>
> Unless we miss something, it would seem that there needs to be
> additional thought as to how a process moves, atomically, from one
> effective security configuration to the next.
>
> At a minimum, if we restrict ourselves to the model of simply changing
> the namespace for a single LSM, there would seem to be a need to have
> a 2-step process in order to atomically transition from one security
> model/policy to the next.

That depends on the individual LSMs; they are free to interpret the
unshare request and handle it however they like.

> The interim between the first and second steps would allow an
> orchestrator to configure the new namespace and load new namespace
> specific policy into the security namespace ...

As discussed previously, the LSM policy load syscalls might include
some LSM namespace options. However, I first want to focus on
finalizing the most basic namespace API, which on Linux is arguably
the unshare() syscall concept.

> It would seem that the flags variable might be a good option to use to
> handle this 2-stage transition, for example LSM_NS_INIT and
> LSM_NS_CHANGE, respectively, to specify the initialization and
> execution phases of the transition.

No.  The lsm_unshare() syscall is intended to mimic the existing
unshare() syscall as a single step process from a user's perspective.
If it returns successfully the caller will be in a new LSM namespace
as defined by the individual LSM specified in the syscall.
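
For illustration only, here is a minimal userspace sketch of what a
single-step caller might look like.  Everything in it is an assumption:
the syscall has no number yet, so lsm_unshare_stub() is a hypothetical
stand-in that always fails with ENOSYS; struct lsm_ctx_sketch simply
mirrors the layout of the existing struct lsm_ctx; and passing an empty
ctx[] payload is a guess, as each LSM would define what it expects.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirrors the layout of struct lsm_ctx from <linux/lsm.h>; redefined
 * here so the sketch builds without recent kernel headers. */
struct lsm_ctx_sketch {
	uint64_t id;       /* LSM_ID_* value naming the LSM to unshare */
	uint64_t flags;    /* per-LSM flags, unused in this sketch */
	uint64_t len;      /* total size of this entry */
	uint64_t ctx_len;  /* size of the ctx[] payload */
	uint8_t  ctx[];    /* LSM-specific payload, empty here (assumption) */
};

/* Build a payload-free context that names a single LSM. */
static struct lsm_ctx_sketch *make_unshare_ctx(uint64_t lsm_id)
{
	struct lsm_ctx_sketch *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;
	ctx->id = lsm_id;
	ctx->len = sizeof(*ctx);
	ctx->ctx_len = 0;
	return ctx;
}

/* Hypothetical stand-in for the proposed syscall.  It only shows the
 * call shape and always fails with ENOSYS. */
static long lsm_unshare_stub(struct lsm_ctx_sketch *ctx, uint32_t size,
			     uint32_t flags)
{
	assert(ctx != NULL);
	(void)size;
	(void)flags;	/* proposed to be 0/unused for now */
	errno = ENOSYS;
	return -1;
}
```

A caller would build one context per LSM and issue one call per LSM; a
return of 0 would mean the caller is already in the new namespace, with
no second step.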

> The other unanswered issue, or perhaps we missed it, is the security
> controls that should be associated with the unshare call.

Each LSM is free to implement whatever access controls it deems
necessary in its lsm_unshare() callback.

> Will there be a new LSM hook that allows other LSMs to veto the
> creation of a namespace either for itself or for another LSM?

I would expect the lsm_unshare() syscall to operate similarly to the
lsm_set_self_attr() syscall in this regard.

> Is there a need to have yet another kernel command-line parameter that
> would completely deny the ability to create security namespaces?

No, at least not at this point in time.

Individual LSMs can decide how they want to gate their own namespace
functionality, if they implement namespaces at all.

> Is CAP_MAC_ADMIN appropriate as the required capability to create a
> new namespace or does there need to be, for security rigor, a specific
> capability (CAP_LSM_NS?) that gates the ability to execute whatever
> form of the system call is adopted?

Once again, this is up to the individual LSMs, not the framework layer.

> Should there be an option to completely compile LSM namespaces out of
> the kernel?

That doesn't belong in the LSM framework layer; that is up to the
individual LSMs.

> > * Implement /proc/pid/ns/lsm and setns(CLONE_NEWLSM)
> >
> > As discussed previously, this allows us to move a process into an
> > existing, established LSM namespace set.  The caller cannot
> > selectively choose which individual LSM namespaces they join from the
> > given LSM namespace set; they receive the same LSM namespace
> > configuration as the target process.
>
> As an initial aside.  It would be assumed that a positive result of a
> setns call would be to cause the calling process to atomically change
> its security namespace set.  This would further suggest the need to
> have the security namespace creation process also execute atomically
> in a multi-LSM namespace change environment.

In the setns case no new LSM namespaces should be created; the process
simply joins an existing set of LSM namespaces.
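
As a rough sketch of the join path from userspace, assuming the proposed
/proc/pid/ns/lsm file exists: the helper names below are hypothetical,
and since CLONE_NEWLSM is not defined in any header today the sketch
passes 0 as the nstype, which setns() already accepts to mean "whatever
namespace type this fd refers to".

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Format the proposed per-process LSM namespace path.
 * Returns the path length, or -1 if buf is too small. */
static int lsm_ns_path(pid_t pid, char *buf, size_t len)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "/proc/%ld/ns/lsm", (long)pid);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/* Join the LSM namespace set of the target process as a whole; there
 * is no way to pick individual LSM namespaces out of the set. */
static int join_lsm_ns(pid_t pid)
{
	char path[64];
	int fd, ret, saved;

	if (lsm_ns_path(pid, path, sizeof(path)) < 0)
		return -1;

	/* Fails with ENOENT on kernels that lack the proposed file. */
	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	ret = setns(fd, 0);
	saved = errno;	/* keep the setns() error across close() */
	close(fd);
	errno = saved;
	return ret;
}
```

Nothing is created here; on success the caller ends up with the same
LSM namespace configuration as the target process.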

> ... That is the concept of whether or not a setns
> call, for any resource namespace, should also force a security
> namespace change if the security namespace of the calling process
> differs from that of the target process.

That decision is left to the individual LSMs.

-- 
paul-moore.com