LSM namespacing API

Thu Aug 21 14:57:11 UTC 2025

On 8/21/25 07:26, Serge E. Hallyn wrote:
> On Thu, Aug 21, 2025 at 12:46:10AM -0700, John Johansen wrote:
>> On 8/19/25 10:47, Stephen Smalley wrote:
>>> On Tue, Aug 19, 2025 at 10:56 AM Paul Moore <paul at paul-moore.com> wrote:
>>>>
>>>> Hello all,
>>>>
>>>> As most of you are likely aware, Stephen Smalley has been working on
>>>> adding namespace support to SELinux, and the work has now progressed
>>>> to the point where a serious discussion on the API is warranted.  For
>>>> those of you who are unfamiliar with the details of Stephen's patchset, or
>>>> simply need a refresher, he has some excellent documentation in his
>>>> work-in-progress repo:
>>>>
>>>> * https://github.com/stephensmalley/selinuxns
>>>>
>>>> Stephen also gave a (pre-recorded) presentation at LSS-NA this year
>>>> about SELinux namespacing, you can watch the presentation here:
>>>>
>>>> * https://www.youtube.com/watch?v=AwzGCOwxLoM
>>>>
>>>> In the past you've heard me state, rather firmly at times, that I
>>>> believe namespacing at the LSM framework layer to be a mistake,
>>>> although if there is something that can be done to help facilitate the
>>>> namespacing of individual LSMs at the framework layer, I would be
>>>> supportive of that.  I think that a single LSM namespace API, similar
>>>> to our recently added LSM syscalls, may be such a thing, so I'd like
>>>> us to have a discussion to see if we all agree on that, and if so,
>>>> what such an API might look like.
>>>>
>>>> At LSS-NA this year, John Johansen and I had a brief discussion where
>>>> he suggested a single LSM wide clone*(2) flag that individual LSM's
>>>> could opt into via callbacks.  John is directly CC'd on this mail, so
>>>> I'll let him expand on this idea.
>>>>
>>>> While I agree with John that a fs based API is problematic (see all of
>>>> our discussions around the LSM syscalls), I'm concerned that a single
>>>> clone*(2) flag will significantly limit our flexibility around how
>>>> individual LSMs are namespaced, something I don't want to see happen.
>>>> This makes me wonder about the potential for expanding
>>>> lsm_set_self_attr(2) to support a new LSM attribute that would support
>>>> a namespace "unshare" operation, e.g. LSM_ATTR_UNSHARE.  This would
>>>> provide a single LSM framework API for an unshare operation while also
>>>> providing a mechanism to pass LSM specific information via the lsm_ctx struct if
>>>> needed.  Just as we do with the other LSM_ATTR_* flags today,
>>>> individual LSMs can opt-in to the API fairly easily by providing a
>>>> setselfattr() LSM callback.
>>>>
>>>> Thoughts?
>>>
>>> I think we want to be able to unshare a specific security module
>>> namespace without unsharing the others, i.e. just SELinux or just
>>> AppArmor.
>>
>> yes which is part of the problem with the single flag. That choice
>> would be entirely at the policy level, without any input from userspace.
> 
> AIUI Paul's suggestion is the user can pre-set the details of which
> lsms to unshare and how with the lsm_set_self_attr(), and then a
> single CLONE_LSM effects that.
> 
Yes, I was specifically addressing the conversation I had with Paul at
LSS, which Paul brought up. That is:

   At LSS-NA this year, John Johansen and I had a brief discussion where
   he suggested a single LSM wide clone*(2) flag that individual LSM's
   could opt into via callbacks.

The idea there isn't all that different from what Paul proposed. You
could have a single flag, if you can provide ancillary information. But
a single flag on its own isn't sufficient.

You can do a subset with a single flag and only policy directing things,
but that would cut container managers out of the decision. Without a
universal container identifier, that really limits what you can do. In
another email I likened it to the MCS label approach to containers,
where you have a single security policy for the container and each
container gets to be a unique instance of that policy. It's not a perfect
analogy, because with namespaces policy can be loaded into the namespace,
making it unique. I don't think the approach is right because not all namespaces
implement a loadable policy, and even when they do I think we can do a
better job if the container manager is allowed to provide additional
context with the namespacing request.
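
To make the "ancillary information" point concrete, here is a rough
sketch of what a container manager might prepare for a per-LSM unshare
request. It is not taken from any posted patch. LSM_ATTR_UNSHARE and its
value are hypothetical (the name comes from Paul's mail, the number is a
placeholder), the payload format is invented, and the struct is a local
mirror of struct lsm_ctx from <linux/lsm.h> so the example stands alone.
The 8-byte padding of the entry is likewise an assumption, modelled on
how the kernel pads the entries it returns.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local mirror of struct lsm_ctx from <linux/lsm.h>, so this compiles alone. */
struct lsm_ctx_sketch {
	uint64_t id;      /* LSM_ID_* of the module being addressed */
	uint64_t flags;   /* LSM specific flags */
	uint64_t len;     /* total size of this entry, padding included */
	uint64_t ctx_len; /* size of the payload held in ctx[] */
	uint8_t  ctx[];   /* LSM specific payload */
};

#define LSM_ID_SELINUX   101 /* existing value in <linux/lsm.h> */
#define LSM_ID_APPARMOR  104 /* existing value in <linux/lsm.h> */
#define LSM_ATTR_UNSHARE 106 /* HYPOTHETICAL: name from the proposal, value invented */

/*
 * Build one request entry aimed at a single LSM, carrying optional
 * manager-supplied context. Returns NULL on allocation failure; the
 * caller frees the result.
 */
struct lsm_ctx_sketch *build_unshare_ctx(uint64_t lsm_id, const void *payload,
					 size_t payload_len)
{
	size_t total = sizeof(struct lsm_ctx_sketch) + payload_len;
	struct lsm_ctx_sketch *ctx;

	/* Assumption: round the entry up to a multiple of 8 bytes. */
	total = (total + 7) & ~(size_t)7;

	ctx = calloc(1, total);
	if (!ctx)
		return NULL;

	ctx->id = lsm_id;
	ctx->flags = 0;
	ctx->len = total;
	ctx->ctx_len = payload_len;
	if (payload_len)
		memcpy(ctx->ctx, payload, payload_len);

	assert(ctx->len % 8 == 0);
	return ctx;
}

/*
 * If the attribute existed, the entry would be handed to the kernel
 * roughly as:
 *     lsm_set_self_attr(LSM_ATTR_UNSHARE, ctx, ctx->len, 0);
 * Only the LSM named by ctx->id would act on it, which is what lets
 * userspace unshare just SELinux or just AppArmor.
 */
```

The id field is what gives userspace the per-LSM choice Stephen asked
for, and the payload is where a container manager could supply the extra
context that a bare clone flag has no room for.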