[RFC 17/20] ima: Use integrity_admin_ns_capable() to check corresponding capability

Thu Dec 2 15:58:12 UTC 2021

On 12/2/2021 5:01 AM, Christian Brauner wrote:
> On Thu, Dec 02, 2021 at 01:59:55PM +0100, Christian Brauner wrote:
>> On Wed, Dec 01, 2021 at 02:29:09PM -0500, James Bottomley wrote:
>>> On Wed, 2021-12-01 at 12:35 -0500, Stefan Berger wrote:
>>>> On 12/1/21 11:58, James Bottomley wrote:
>>>>> On Tue, 2021-11-30 at 11:06 -0500, Stefan Berger wrote:
>>>>>> From: Denis Semakin <denis.semakin at huawei.com>
>>>>>>
>>>>>> Use integrity_admin_ns_capable() to check corresponding
>>>>>> capability to allow read/write IMA policy without CAP_SYS_ADMIN
>>>>>> but with CAP_INTEGRITY_ADMIN.
>>>>>>
>>>>>> Signed-off-by: Denis Semakin <denis.semakin at huawei.com>
>>>>>> ---
>>>>>>    security/integrity/ima/ima_fs.c | 2 +-
>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
>>>>>> index fd2798f2d224..6766bb8262f2 100644
>>>>>> --- a/security/integrity/ima/ima_fs.c
>>>>>> +++ b/security/integrity/ima/ima_fs.c
>>>>>> @@ -393,7 +393,7 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
>>>>>>    #else
>>>>>>    		if ((filp->f_flags & O_ACCMODE) != O_RDONLY)
>>>>>>    			return -EACCES;
>>>>>> -		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
>>>>>> +		if (!integrity_admin_ns_capable(ns->user_ns))
>>>>> so this one is basically replacing what you did in RFC 16/20, which
>>>>> seems a little redundant.
>>>>>
>>>>> The question I'd like to ask is: is there still a reason for
>>>>> needing CAP_INTEGRITY_ADMIN?  My thinking is that now IMA is pretty
>>>>> much tied to requiring a user (and a mount, because of
>>>>> securityfs_ns) namespace, there might not be a pressing need for an
>>>>> admin capability separated from CAP_SYS_ADMIN because the owner of
>>>>> the user namespace passes the ns_capable(..., CAP_SYS_ADMIN)
>>>>> check.  The rationale in
>>>> Casey suggested using CAP_MAC_ADMIN, which I think would also work.
>>>>
>>>>       CAP_MAC_ADMIN (since Linux 2.6.25)
>>>>                 Allow MAC configuration or state changes.
>>>>                 Implemented for the Smack Linux Security Module (LSM).
>>>>
>>>>
>>>> Down the road I think we should cover setting file extended
>>>> attributes with the same capability as well for when a user signs
>>>> files or installs packages with file signatures.  A container runtime
>>>> could hold CAP_SYS_ADMIN while setting up a container and mounting
>>>> filesystems and drop it for the first process started there. Since we
>>>> are using the user namespace to spawn an IMA namespace, we would then
>>>> require CAP_SYS_ADMIN to be left available so that the user can do
>>>> IMA related stuff in the container (set or append to the policy,
>>>> write file signatures). I am not sure whether that should be the case
>>>> or rather give the user something finer grained, such as
>>>> CAP_MAC_ADMIN. So, it's about granularity...

The important rationale for capabilities is separation
of privilege from user id. Granularity has always been a
contentious issue. Whether you use CAP_SYS_ADMIN or CAP_MAC_ADMIN
you are using privilege, and need to be diligent.

>>> It's possible ... any orchestration system that doesn't enter a user
>>> namespace has to strictly regulate capabilities.   I'm probably biased
>>> because I always use a user_ns so I never really had to mess with
>>> capabilities.
>>>
>>>>> https://kernsec.org/wiki/index.php/IMA_Namespacing_design_considerations
>>>>>
>>>>> Is effectively "because CAP_SYS_ADMIN is too powerful" but that's
>>>>> no longer true of the user namespace owner.  It only passes the
>>>>> ns_capable() check not the capable() one, so while it does get
>>>>> CAP_SYS_ADMIN, it can only use it in a few situations which
>>>>> represent quite a power reduction already.
>>>> At least docker containers drop CAP_SYS_ADMIN.
>>> Well docker doesn't use the user_ns.  But even given that,
>>> CAP_SYS_ADMIN is always dropped for most container systems.  What
>>> happens when you enter a user namespace is the ns_capable( ...,
>>> CAP_SYS_ADMIN) check returns true if you're the owner of the user_ns,
>>> in the same way it would for root.  So effectively entering a user
>>> namespace without CAP_SYS_ADMIN but mapping the owner id to 0 (what
>>> unshare -r --user does) gives you back a form of CAP_SYS_ADMIN that
>>> responds only in the places in the kernel that have a ns_capable()
>>> check instead of a capable() one (most of the places you list below).
>>> This is the principle of how unprivileged containers actually work ...
>>> and the source of some of our security problems if you get back an
>>> ability to do something you shouldn't be allowed to do as an
>>> unprivileged user.
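To make the capable() versus ns_capable() distinction above concrete, here is a minimal userspace sketch of the namespace walk. It is a simplified model and not kernel code: struct user_ns, struct cred, model_ns_capable() and model_capable() are illustrative stand-ins, and things like LSM hooks and uid mappings are left out on purpose.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Real value from <linux/capability.h>. */
#define CAP_SYS_ADMIN 21

/* Simplified stand-in for the kernel's struct user_namespace. */
struct user_ns {
	struct user_ns *parent;	/* NULL for the initial namespace */
	unsigned int owner;	/* uid in the parent that created this namespace */
	int level;		/* nesting depth, 0 for the initial namespace */
};

/* Simplified stand-in for struct cred. */
struct cred {
	unsigned int euid;
	struct user_ns *user_ns;	/* namespace the task lives in */
	uint64_t cap_effective;		/* effective set, relative to user_ns */
};

/* Model of the initial user namespace. */
static struct user_ns init_user_ns = { NULL, 0, 0 };

/* Model of ns_capable(): does cred hold cap with respect to ns? */
static bool model_ns_capable(const struct cred *cred,
			     const struct user_ns *ns, int cap)
{
	for (;;) {
		/* Same namespace: the task's own effective set decides. */
		if (ns == cred->user_ns)
			return (cred->cap_effective >> cap) & 1;
		/* Target is not a descendant of the task's namespace. */
		if (ns->level <= cred->user_ns->level)
			return false;
		/* The owner of a child namespace holds all caps in it. */
		if (ns->parent == cred->user_ns && ns->owner == cred->euid)
			return true;
		ns = ns->parent;
	}
}

/* Model of capable(): always checks against the initial namespace. */
static bool model_capable(const struct cred *cred, int cap)
{
	return model_ns_capable(cred, &init_user_ns, cap);
}
```

In this model a task holding a full capability set inside a child namespace passes model_ns_capable() for that namespace but fails model_capable(), and the uid that owns the child namespace passes model_ns_capable() for it from the parent without holding any capability bits at all.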
>>>
>>>>   I am not sure what the decision was based on but probably they don't
>>>> want to give the user what is not absolutely necessary, but usage of
>>>> user namespaces (with IMA namespaces) would kind of force it to be
>>>> available then to do IMA-related stuff ...
>>>>
>>>> Following this man page here
>>>> https://man7.org/linux/man-pages/man7/user_namespaces.7.html
>>>>
>>>> CAP_SYS_ADMIN in a user namespace is about
>>>>
>>>> - bind-mounting filesystems
>>>>
>>>> - mounting /proc filesystems
>>>>
>>>> - creating nested user namespaces
>>>>
>>>> - configuring UTS namespace
>>>>
>>>> - configuring whether setgroups() can be used
>>>>
>>>> - usage of setns()
>>>>
>>>>
>>>> Do we want to add '- only way of *setting up* IMA related stuff' to
>>>> this list?
>>> I don't see why not, but other container people should weigh in
>>> because, as I said, I mostly use the user namespace and unprivileged
>>> containers and don't bother with capabilities.
>> There are very few scenarios where dropping capabilities in an
>> unprivileged container makes sense. In a lot of other scenarios it is
>> just a misunderstanding of the meaning of capabilities and their
>> relationship to user namespaces. Usually, granting a full set of
>> capabilities to the payload of an unprivileged container is the right
>> thing to do. All things that are properly namespaced will check
>> capabilities in the relevant user namespace. Those that aren't will
>> check them against the initial user namespace.
>>
>> But I do think the question of whether or not ima should go into
>> cap_sys_admin is more a question of capability semantics than it is of
>> how exactly ima is namespaced. We have agreed before that overloading
>> cap_sys_admin further isn't ideal. Often we end up rectifying that
>> mistake later. For example, how we moved stuff like criu, bpf, and perf
>> to their own capability. Now we're left with stuff like:
>>
>> static inline bool perfmon_capable(void)
>> {
>> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
>> }
>>
>> static inline bool bpf_capable(void)
>> {
>> 	return capable(CAP_BPF) || capable(CAP_SYS_ADMIN);
>> }
>>
>> static inline bool checkpoint_restore_ns_capable(struct user_namespace *ns)
>> {
>> 	return ns_capable(ns, CAP_CHECKPOINT_RESTORE) ||
>> 		ns_capable(ns, CAP_SYS_ADMIN);
>> }
>>
>> for the sake of adhering to legacy behavior. I think we can skip over
>> that mistake and introduce cap_sys_integrity.
> (Or under CAP_MAC_ADMIN as suggested elsewhere in the thread as I saw
> just now.)
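
For comparison, the two shapes such a helper could take can be sketched in userspace as follows. This is only an illustration under assumptions: the CAP_INTEGRITY_ADMIN number is made up (no such value is assigned upstream), and struct task_caps, has_cap() and both helper names are hypothetical, not taken from the patch set.

```c
#include <stdbool.h>
#include <stdint.h>

#define CAP_SYS_ADMIN        21	/* real value from <linux/capability.h> */
#define CAP_INTEGRITY_ADMIN  41	/* hypothetical number, for illustration only */

/* Stand-in for a task's effective capability set in one user namespace. */
struct task_caps {
	uint64_t effective;
};

static bool has_cap(const struct task_caps *t, int cap)
{
	return (t->effective >> cap) & 1;
}

/* Legacy-compatible shape, as in perfmon_capable() and bpf_capable(). */
static bool integrity_admin_with_fallback(const struct task_caps *t)
{
	return has_cap(t, CAP_INTEGRITY_ADMIN) || has_cap(t, CAP_SYS_ADMIN);
}

/* Shape open to a new interface that has no legacy users to preserve. */
static bool integrity_admin_strict(const struct task_caps *t)
{
	return has_cap(t, CAP_INTEGRITY_ADMIN);
}
```

With the fallback, a task that holds only CAP_SYS_ADMIN still passes; with the strict form it does not, which is the difference between carrying the legacy behavior forward and skipping it.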