LSM namespacing API

Thu Aug 21 08:12:15 UTC 2025

On 8/20/25 19:35, Paul Moore wrote:
> On Wed, Aug 20, 2025 at 10:05 PM Serge E. Hallyn <serge at hallyn.com> wrote:
>> On Tue, Aug 19, 2025 at 02:51:00PM -0400, Paul Moore wrote:
>>> On Tue, Aug 19, 2025 at 1:47 PM Stephen Smalley
>>> <stephen.smalley.work at gmail.com> wrote:
> 
> ...
> 
>>>> Serge pointed out that we also will need an API to attach to an
>>>> existing SELinux namespace, which I captured here:
>>>> https://github.com/stephensmalley/selinuxns/issues/19
>>>> This is handled for other Linux namespaces by opening a pseudo file
>>>> under /proc/pid/ns and invoking setns(2), so not sure how we want to
>>>> do it.
>>>
>>> One option would be to have the LSM framework return a LSM namespace
>>> "handle" for a given LSM using lsm_get_self_attr(2) and then do a
>>> setns(2)-esque operation using lsm_set_self_attr(2) with that
>>> "handle".  We would need to figure out what would constitute a
>>> "handle" but let's just mark that as TBD for now with this approach (I
>>> think better options are available).
>>
>> The use case which would be complicated (not blocked) by this, is
>>
>> * a runtime creates a process p1
>>    * p1 unshares its lsm namespace
>> * runtime forks a debug/admin process p2
>>    * p2 wants to enter p1's namespace
>>
>> Of course the runtime could work around it by, before relinquishing
>> control of p1 to a new executable, returning the lsm_get_self_attr()
>> data over a pipe.
>>
>> Note I don't think we should support setting another task's namespace,
>> only getting its namespace ID.
>>
>>> Since we have an existing LSM namespace combination, with processes
>>> running inside of it, it might be sufficient to simply support moving
>>> into an existing LSM namespace set with setns(2) using only a pidfd
>>> and a new CLONE_LSMNS flag (or similar, upstream might want this as
>>> CLONE_NEWLSM).  This would simply set the LSM namespace set for the
>>> setns(2) caller to match that of the target pidfd.  We still wouldn't
>>> want to support CLONE_LSMNS/CLONE_NEWLSM for clone*().
>>
>> A part of me is telling (another part of) me that being able to setns
>> to a subset of the lsms could lead to privilege escapes through
>> weird policy configurations for the various LSMs.  In which case,
>> an all-or-nothing LSM setns might actually be preferable.
> 
> Sorry I probably wasn't as clear as I should have been, but my idea
> with using the existing procfs/setns(2) approach with a single
> CLONE_NEWLSM (name pending sufficient bikeshedding) was that the
> process being setns()'d would simply end up in the exact copy of the
> target process' LSM namespace configuration, it shouldn't be a new
> set/subset/configuration ... and I would expect us to have controls
> around that such that LSMs could enforce policy on a setns(2)
> operation that involved their LSM.
> 
Entering as a complete set is certainly the safest. At a minimum, the
LSMs are going to need to be able to specify the set of namespaces
that are needed if you enter the LSM namespace. The easiest way to
do this is what you propose: take away the flexibility and allow
moving everything as a set.

I do think we might still have a need to be able to request entering
an LSM namespace from the set, but I think that, at least for a first
pass, it's probably better to not go there.
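
For reference, here is a rough C sketch of the two existing attach
mechanisms mentioned in the thread: opening a pseudo file under
/proc/pid/ns and calling setns(2), and calling setns(2) on a pidfd with
a set of CLONE_NEW* flags (Linux 5.8 and later). Nothing here is
LSM-specific. CLONE_NEWLSM/CLONE_LSMNS do not exist in any released
kernel, so a caller would have to pass an existing flag such as
CLONE_NEWUTS as a stand-in. The helper names (ns_path, join_ns_procfs,
join_ns_pidfd) are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: build "/proc/<pid>/ns/<name>" into buf.
 * Returns 0 on success, -1 if name is empty, contains '/', or the
 * result does not fit in buf. */
int ns_path(char *buf, size_t len, pid_t pid, const char *name)
{
    assert(buf != NULL && name != NULL);
    if (name[0] == '\0' || strchr(name, '/') != NULL)
        return -1;
    int n = snprintf(buf, len, "/proc/%ld/ns/%s", (long)pid, name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Existing procfs approach: open the namespace pseudo file of the
 * target process and hand the fd to setns(2). nstype is the matching
 * CLONE_NEW* flag (e.g. CLONE_NEWUTS for "uts"), or 0 for "any". */
int join_ns_procfs(pid_t pid, const char *name, int nstype)
{
    char path[64];
    if (ns_path(path, sizeof(path), pid, name) < 0)
        return -1;

    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    int ret = setns(fd, nstype);
    int saved = errno;          /* keep setns() errno across close() */
    close(fd);
    errno = saved;
    return ret;
}

/* pidfd approach (Linux 5.8+): setns(2) on a pidfd moves the caller
 * into the target's namespaces for every CLONE_NEW* bit in nsflags.
 * A single "LSM namespace set" flag as proposed in the thread would,
 * under that assumption, be passed here; no such flag exists today. */
int join_ns_pidfd(pid_t pid, int nsflags)
{
    int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
    if (pidfd < 0)
        return -1;

    int ret = setns(pidfd, nsflags);
    int saved = errno;
    close(pidfd);
    errno = saved;
    return ret;
}
```

In the p1/p2 use case above, p2 would call something like
join_ns_pidfd(p1_pid, CLONE_NEWUTS) today, and would swap in the new
flag if the all-or-nothing proposal is adopted. Both calls normally
need CAP_SYS_ADMIN in the relevant user namespace, and an LSM hook on
the setns(2) path would be where per-LSM policy is enforced.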