[PATCH] xfs: use has_capability_noaudit() instead of capable() where appropriate
Ondrej Mosnacek
omosnace at redhat.com
Thu Mar 18 09:51:29 UTC 2021
On Tue, Mar 16, 2021 at 10:09 PM Dave Chinner <david at fromorbit.com> wrote:
> On Tue, Mar 16, 2021 at 06:32:26PM +0100, Ondrej Mosnacek wrote:
> > In cases when a negative result of a capability check doesn't lead to an
> > immediate, user-visible error, only a subtle difference in behavior, it
> > is better to use has_capability_noaudit(current, ...), so that LSMs
> > (e.g. SELinux) don't generate a denial record in the audit log each time
> > the capability status is queried. This patch should cover all such cases
> > in fs/xfs/.
>
> Is this something new? I only see 4 calls to
> has_capability_noaudit() in 5.12-rc3...
I don't know all the history, but I don't think it's new. It's just
that few people really are aware of the difference and no one from the
LSM/SELinux cared enough to maintain proper use across the kernel...
>
> Also, has_capability_noaudit() is an awful name. capable_noaudit()
> would actually be self explanatory to anyone who is used to doing
> capability checks via capable(), ns_capable(), ns_capable_noaudit(),
> inode_owner_or_capable(), capable_wrt_inode_uidgid(), etc...
>
> Please fix the name of this function to be consistent with the
> existing capability APIs before propagating it further into the
> kernel.
That's a fair point - I should take this opportunity to add the
missing function and add some documentation... I'll make sure to do
better in v2.
>
> > Note that I kept the capable(CAP_FSETID) checks, since these will only
> > be executed if the user explicitly tries to set the SUID/SGID bit, and
> > it likely makes sense to log such attempts even if the syscall doesn't
> > fail and just ignores the bits.
>
> So how on earth are we supposed to maintain this code correctly?
> These are undocumented rules that seemingly are applied to random
> subsystems and to seemingly random capable() calls in those
> subsystems. ANd you don't even document it in this code where there
> are other capable(...) checks that will generate audit records...
>
> How are we supposed to know when an audit record should be emitted
> or not by some unknown LSM when we do a capability check?
> Capabilities are already an awful nightmare maze of similar but
> slightly different capability checks, and this doesn't improve the
> situation at all.
>
> Please make this easier to get right iand maintain correctly (an
> absolute, non-negotiable requirement for anything security related)
> before you spray yet another poorly documented capability function
> into the wider kernel.
Again, you're right that I shouldn't have taken the lazy path :)
>
> > Signed-off-by: Ondrej Mosnacek <omosnace at redhat.com>
> > ---
> > fs/xfs/xfs_fsmap.c | 4 ++--
> > fs/xfs/xfs_ioctl.c | 5 ++++-
> > fs/xfs/xfs_iops.c | 6 ++++--
> > fs/xfs/xfs_xattr.c | 2 +-
> > 4 files changed, 11 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
> > index 9ce5e7d5bf8f..14672e7ee535 100644
> > --- a/fs/xfs/xfs_fsmap.c
> > +++ b/fs/xfs/xfs_fsmap.c
> > @@ -842,8 +842,8 @@ xfs_getfsmap(
> > !xfs_getfsmap_is_valid_device(mp, &head->fmh_keys[1]))
> > return -EINVAL;
> >
> > - use_rmap = capable(CAP_SYS_ADMIN) &&
> > - xfs_sb_version_hasrmapbt(&mp->m_sb);
> > + use_rmap = xfs_sb_version_hasrmapbt(&mp->m_sb) &&
> > + has_capability_noaudit(current, CAP_SYS_ADMIN);
> > head->fmh_entries = 0;
> >
> > /* Set up our device handlers. */
> > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > index 3fbd98f61ea5..3cfc1a25069c 100644
> > --- a/fs/xfs/xfs_ioctl.c
> > +++ b/fs/xfs/xfs_ioctl.c
> > @@ -1470,8 +1470,11 @@ xfs_ioctl_setattr(
> >
> > if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_PQUOTA_ON(mp) &&
> > ip->i_d.di_projid != fa->fsx_projid) {
> > + int flags = has_capability_noaudit(current, CAP_FOWNER) ?
> > + XFS_QMOPT_FORCE_RES : 0;
> > +
> > code = xfs_qm_vop_chown_reserve(tp, ip, NULL, NULL, pdqp,
> > - capable(CAP_FOWNER) ? XFS_QMOPT_FORCE_RES : 0);
> > + flags);
> > if (code) /* out of quota */
> > goto error_trans_cancel;
> > }
>
> You missed a capable() call here - see the call to
> xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in
> xfs_ioctl_setattr_get_trans().
Ah, I mistakenly based the path against an old tree. Sorry, I'll redo
it against xfs/for-next...
>
> > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > index 67c8dc9de8aa..abbb417c4fbd 100644
> > --- a/fs/xfs/xfs_iops.c
> > +++ b/fs/xfs/xfs_iops.c
> > @@ -729,10 +729,12 @@ xfs_setattr_nonsize(
> > if (XFS_IS_QUOTA_RUNNING(mp) &&
> > ((XFS_IS_UQUOTA_ON(mp) && !uid_eq(iuid, uid)) ||
> > (XFS_IS_GQUOTA_ON(mp) && !gid_eq(igid, gid)))) {
> > + int flags = has_capability_noaudit(current, CAP_FOWNER) ?
> > + XFS_QMOPT_FORCE_RES : 0;
> > +
> > ASSERT(tp);
> > error = xfs_qm_vop_chown_reserve(tp, ip, udqp, gdqp,
> > - NULL, capable(CAP_FOWNER) ?
> > - XFS_QMOPT_FORCE_RES : 0);
> > + NULL, flags);
> > if (error) /* out of quota */
> > goto out_cancel;
> > }
>
> You missed a capable() call here - see the call to
> xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in this
> function.
>
> I think this demonstrates just how fragile and hard to maintain the
> approach being taken here is.
>
> > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> > index bca48b308c02..a99d19c2c11f 100644
> > --- a/fs/xfs/xfs_xattr.c
> > +++ b/fs/xfs/xfs_xattr.c
> > @@ -164,7 +164,7 @@ xfs_xattr_put_listent(
> > * Only show root namespace entries if we are actually allowed to
> > * see them.
> > */
> > - if (!capable(CAP_SYS_ADMIN))
> > + if (!has_capability_noaudit(current, CAP_SYS_ADMIN))
> > return;
> >
> > prefix = XATTR_TRUSTED_PREFIX;
>
> This one should absolutely report a denial - someone has tried to
> read the trusted xattr namespace without permission to do so. That's
> exactly the sort of thing I'd want to see in an audit log - just
> because we just elide the xattrs rather than return an error doesn't
> mean we should not leave an audit trail from the attempted access
> of kernel trusted attributes...
I'm not sure about that... without CAP_SYS_ADMIN the caller would
still get the ACL xattrs, no? IIUC, it's a filter to not show
restricted xattrs to unprivileged users via listxattr(2)**, where the
user is not saying "give me the trusted xattrs", just "give me
whatever I'm allowed to see", so logging the denial wouldn't make much
sense - the user may not even care about trusted xattrs when doing the
syscall (and in 99% of cases a user without CAP_SYS_ADMIN really
won't).
(**) But I don't understand how exactly that function is used and what
the XFS_ATTR_ROOT flag means, so I may be getting it wrong...
--
Ondrej Mosnacek
Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.
More information about the Linux-security-module-archive
mailing list