LSM namespacing API

Sat Aug 23 23:00:08 UTC 2025

On 8/23/25 10:41, Dr. Greg wrote:
> On Fri, Aug 22, 2025 at 12:59:29PM -0700, John Johansen wrote:
> 
> Good morning, I hope the weekend is going well for everyone.
> 
>> On 8/22/25 07:47, Casey Schaufler wrote:
>>> On 8/21/2025 7:14 PM, Paul Moore wrote:
>>>> On Thu, Aug 21, 2025 at 6:00 AM Mickaël Salaün <mic at digikod.net>
>>>> wrote:
>>>>> On Tue, Aug 19, 2025 at 02:40:52PM -0400, Paul Moore wrote:
>>>>>> On Tue, Aug 19, 2025 at 1:11 PM Casey Schaufler
>>>>>> <casey at schaufler-ca.com> wrote:
>>>>>>> The advantage of a clone flag is that the operation is atomic with
>>>>>>> the other namespace flag based behaviors. Having a two step process
>>>>>>>
>>>>>>>          clone(); lsm_set_self_attr(); - or -
>>>>>>>          lsm_set_self_attr(); clone();
>>>>>>>
>>>>>>> is going to lead to cases where neither order really works correctly.
>>>>>> I was envisioning something that works similarly to LSM_ATTR_EXEC
>>>>>> where the unshare isn't immediate, but rather happens at a future
>>>>>> event.  With LSM_ATTR_EXEC it happens at the next exec*(), with
>>>>>> LSM_ATTR_UNSHARE I imagine it would happen at the next clone*().
>>>>> The next unshare(2) would make more sense to me.
>>>> That's definitely something to discuss.  I've been fairly loose on
>>>> that in the discussion thus far, but as things are starting to settle
>>>> on the lsm_set_self_attr(2) approach as one API, we should start to
>>>> clarify that.
>>>>
>>>>> This deferred operation could be requested with a flag in
>>>>> lsm_config_system_policy(2) instead:
>>>>> https://lore.kernel.org/r/20250709080220.110947-1-maxime.belair@canonical.com
>>>> I want to keep the policy syscall work separate from the LSM namespace
>>>> discussion as we don't want to require a policy load operation to
>>>> create a new LSM namespace.  I think it's probably okay if the policy
>>>> syscall work were to be namespace aware so that an orchestrator could
>>>> load a LSM policy into a LSM namespace other than its own, but that
>>>> is still not overly dependent on what we are discussing here (yes,
>>>> maybe it is a little, but only just so).
>>>
>>> Policy load and namespace manipulation *must* be kept separate. Smack
>>> requires the ability to "load policy" at any time. Smack allows a process
>>> to add "policy" to further restrict its own access, and does not require
>>> a namespace change. There has been an implementation of namespaces for
>>> Smack, but the developers disappeared quietly and sadly no one picked it
>>> up. Introducing a requirement that LSMs support namespaces in order to
>>> load policy beyond system initialization is a non-starter.
> 
>> yes the ability to load policy must exist separately, however
>> policy load could be made namespace aware so that a parent could
>> inject policy into a child.
> 
> Policy or model load, specific to the subordinate namespace, will be
> a necessity.
> 
> As Casey noted, some LSM namespaces will require configuration and
> management calls well after the namespace has started.  Other LSM's
> will want the configuration to be completed before the namespace
> starts, with any further configurations to the namespace blocked.
> 
> There is a very valid security rationale for isolating the capability
> for namespace separation from the capability that allows the
> configuration of a security model.  It would be an entirely realistic
> security objective for a namespace to block further separation
> attempts, while still allowing for management operations to be
> conducted in the context of the subordinate namespace.
> 
> Hence the rationale for splitting CAP_MAC_ADMIN from whatever name the
> bike shedding process around the new capability naming process
> produces.
> 
>> There is also an open question as to whether we need to allow, but
>> not require, some kind of policy manipulation/injection with the
>> creation of the LSM namespace so that there is an atomic
>> transition with entering the namespace. Is there a case where policy
>> really needs to be present atomically with the creation of the
>> namespace? If so we need to further break it down to
>>
>> 1. is it sufficient for the LSM to do it, without container manager
>> guidance?  An inherit of policy, or already present policy that can be
>> injected. Then we don't need policy load inject to be considered at
>> the point of clone/unshare.
>>
>> 2. do we need to let the container manager hint/load policy.
> 
> Policy load needs to be atomic with respect to namespace separation.
> In other words, the policy needs to be in place when execution within
> the context of the new security namespace begins.
> 
No, it _may_ need to be, depending on the model/policy being used, and
an LSM is in the best place to make that decision and do it for its own
policy, as long as the infrastructure supports it.

> A resource orchestrator will need the ability to load the new policy
> that will be enforced into the context of the new namespace.

No, an LSM is fully capable of doing this and, honestly, is in a better
position to do so for its own policy than an external orchestrator.
Where coordination is needed is at the infrastructure layer (LSM), to
ensure that once everything is decided by the individual LSMs, the
security context is atomically set up correctly.

So in that sense the LSM infrastructure is an orchestrator, but only
in the loosest sense.

> 
> In the case of some model/integrity based LSM's, the security events
> related to the policy load need to occur in the context of the parent
> LSM namespace.
> 
Yes, it very much depends on the model. I would argue that if the LSM
needs this:
1. the policy already needs to be loaded at the
    exec/fork/clone/unshare point.
2. the LSM's policy needs a way to initiate the transition. E.g. in
    the SELinux case, the transition is setting up a new layer of
    mediation that will be bounded by the previous layers. There isn't
    a transition from one policy to another, but rather a new layer
    added on top.
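
To make the "new layer bounded by the previous layers" idea concrete,
here is a rough userspace sketch. Everything in it (struct policy_layer,
the OP_* bits, layers_permit()) is made up for illustration and is not
how SELinux or any other LSM actually implements this. It only shows
that stacking a layer can narrow what is allowed, never widen it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation bits, for illustration only. */
enum { OP_READ = 1u << 0, OP_WRITE = 1u << 1, OP_EXEC = 1u << 2 };

/* Hypothetical model: one policy layer per nested LSM namespace. */
struct policy_layer {
	unsigned int allowed;              /* operations this layer permits */
	const struct policy_layer *parent; /* enclosing layer, NULL at the root */
};

/*
 * An operation is permitted only if every layer, from the innermost
 * one out to the root, permits it. A child layer is therefore always
 * bounded by the layers beneath it.
 */
static bool layers_permit(const struct policy_layer *layer, unsigned int op)
{
	for (; layer != NULL; layer = layer->parent) {
		if ((layer->allowed & op) != op)
			return false;
	}
	return true;
}
```

With a root layer allowing read and write, and a child layer allowing
read and exec, only read gets through from the child: write is denied
by the child layer and exec is denied by the root.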

> See the writings of Werner Karl Heisenberg for the reasoning behind
> that... :-)
> 
>> So far I think the inherit/policy directed injection works for
>> apparmor, and selinux. Container managers generally speaking have to
>> do additional setup after the container is created before running
>> the workload, which means a separate load phase should be fine.
>>
>> However I can see an argument for having policy in place when
>> clone/unshare exit. Admittedly atm it's largely around flexibility and
>> nebulous, ill-defined use cases. Just because something works for
>> apparmor, selinux, and I think smack, doesn't mean it would work for
>> all use cases.
>>
>> But we also shouldn't add flexibility for flexibility's sake just
>> because we can see there might be some future utility for some future
>> use case. It
>> would certainly make the interface uglier, and more complicated, and I
>> would hate to have to carry that without a concrete use case.
>>
>> I think unless there is a solid use case for making clone/unshare
>> policy aware we don't worry about it for now. A new interface can be
>> added in the future if the capability is really needed.
> 
> We will respond more directly to the issue of clone, unshare and
> external process entry, in the other thread where you initiated a
> discussion of these issues.  We believe there is a strong argument to
> be made that LSM namespace separation is a poor fit for the classic
> fork/unshare model of the other resource namespaces.
> 
The other resource namespaces being able to move independently of the
security namespace, or at least of mediation by the security namespace,
is a complete disaster and should never have been allowed.

> Among other issues, a direct separation model places the complexity of
> policy verification and loading in userspace.  As was noted above,
> accounting for the security events related to the policy verification
> and load process, in the orchestrator process, will be a requirement
> for some integrity and functional models.
> 
There are different levels of verification. It makes sense to do some
of it in the individual LSM, some of it in userspace, and potentially
some at another level in another LSM. Unfortunately Linux has forced the
concept of containers to be a user level construct, and this forces
certain verifications around containers to be in userspace.

AppArmor's policy verification, which checks that policy meets all the
bounding constraints etc., is very different from the verification
check that IMA may be doing, which checks that this policy is blessed
and allowed to be loaded. AppArmor could support some IMA verification
but is very much designed to be like Landlock, in that unprivileged
userspace _may_ have privilege to load policy into the kernel. You may
not want to allow that on some systems, but you certainly do on others.
The system-level signature check that IMA does isn't appropriate for
unprivileged user policy. But the AppArmor verification check is.

And yes, something like IMA that is doing a system-level integrity
check is going to need a post-policy-load callback to do verification.
This again doesn't need an orchestrator, just support in the
infrastructure and a callback that individual LSMs can trigger. See the
work Paul is doing to rework the LSM init, or how IMA is doing a
verification of SELinux policy.
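
As a rough illustration of what I mean by a callback the individual
LSMs trigger, here is a userspace sketch. None of these names exist in
the kernel; register_policy_verifier(), notify_policy_loaded() and the
callback signature are all assumptions made up for this example, and
the real hook plumbing would look different.

```c
#include <stddef.h>

#define MAX_VERIFIERS 4

/*
 * Hypothetical callback type: return 0 to accept the policy that was
 * just loaded, nonzero to reject it.
 */
typedef int (*policy_loaded_cb)(const char *lsm, const void *blob, size_t len);

static policy_loaded_cb verifiers[MAX_VERIFIERS];
static size_t nr_verifiers;

/* An integrity LSM (something IMA-like) registers its verifier here. */
static int register_policy_verifier(policy_loaded_cb cb)
{
	if (cb == NULL || nr_verifiers == MAX_VERIFIERS)
		return -1;
	verifiers[nr_verifiers++] = cb;
	return 0;
}

/*
 * An individual LSM calls this after it has loaded policy. The first
 * verifier that rejects the policy stops the walk and its error code
 * is returned to the caller.
 */
static int notify_policy_loaded(const char *lsm, const void *blob, size_t len)
{
	for (size_t i = 0; i < nr_verifiers; i++) {
		int rc = verifiers[i](lsm, blob, len);
		if (rc != 0)
			return rc;
	}
	return 0;
}

/* Stand-in verifier for the example: only rejects empty policy blobs. */
static int reject_empty_policy(const char *lsm, const void *blob, size_t len)
{
	(void)lsm;
	return (blob != NULL && len > 0) ? 0 : -1;
}
```

The point is only the shape: the LSM that owns the policy decides when
to call notify_policy_loaded(), and the infrastructure fans that out to
whoever registered. No external orchestrator is involved.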

Of course you have to trust the LSMs to trigger the callback, but it's
open source and the code can be checked. If you can't trust the
individual LSMs you have a much bigger problem, because you just can't
trust a monolithic kernel and you are going to need a trust
zone/hypervisor above the kernel to do any form of integrity check you
can trust.

> Have a good weekend.
> 
> As always,
> Dr. Greg
> 
> The Quixote Project - Flailing at the Travails of Cybersecurity
>                https://github.com/Quixote-Project
>