[PATCH] xfs: use has_capability_noaudit() instead of capable() where appropriate

Thu Mar 18 09:51:29 UTC 2021

On Tue, Mar 16, 2021 at 10:09 PM Dave Chinner <david at fromorbit.com> wrote:
> On Tue, Mar 16, 2021 at 06:32:26PM +0100, Ondrej Mosnacek wrote:
> > In cases when a negative result of a capability check doesn't lead to an
> > immediate, user-visible error, only a subtle difference in behavior, it
> > is better to use has_capability_noaudit(current, ...), so that LSMs
> > (e.g. SELinux) don't generate a denial record in the audit log each time
> > the capability status is queried. This patch should cover all such cases
> > in fs/xfs/.
>
> Is this something new? I only see 4 calls to
> has_capability_noaudit() in 5.12-rc3...

I don't know all the history, but I don't think it's new. It's just
that few people really are aware of the difference and no one from the
LSM/SELinux cared enough to maintain proper use across the kernel...

>
> Also, has_capability_noaudit() is an awful name. capable_noaudit()
> would actually be self explanatory to anyone who is used to doing
> capability checks via capable(), ns_capable(), ns_capable_noaudit(),
> inode_owner_or_capable(), capable_wrt_inode_uidgid(), etc...
>
> Please fix the name of this function to be consistent with the
> existing capability APIs before propagating it further into the
> kernel.

That's a fair point - I should take this opportunity to add the
missing function and add some documentation... I'll make sure to do
better in v2.

>
> > Note that I kept the capable(CAP_FSETID) checks, since these will only
> > be executed if the user explicitly tries to set the SUID/SGID bit, and
> > it likely makes sense to log such attempts even if the syscall doesn't
> > fail and just ignores the bits.
>
> So how on earth are we supposed to maintain this code correctly?
> These are undocumented rules that seemingly are applied to random
> subsystems and to seemingly random capable() calls in those
> subsystems. ANd you don't even document it in this code where there
> are other capable(...) checks that will generate audit records...
>
> How are we supposed to know when an audit record should be emitted
> or not by some unknown LSM when we do a capability check?
> Capabilities are already an awful nightmare maze of similar but
> slightly different capability checks, and this doesn't improve the
> situation at all.
>
> Please make this easier to get right iand maintain correctly (an
> absolute, non-negotiable requirement for anything security related)
> before you spray yet another poorly documented capability function
> into the wider kernel.

Again, you're right that I shouldn't have taken the lazy path :)

>
> > Signed-off-by: Ondrej Mosnacek <omosnace at redhat.com>
> > ---
> >  fs/xfs/xfs_fsmap.c | 4 ++--
> >  fs/xfs/xfs_ioctl.c | 5 ++++-
> >  fs/xfs/xfs_iops.c  | 6 ++++--
> >  fs/xfs/xfs_xattr.c | 2 +-
> >  4 files changed, 11 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
> > index 9ce5e7d5bf8f..14672e7ee535 100644
> > --- a/fs/xfs/xfs_fsmap.c
> > +++ b/fs/xfs/xfs_fsmap.c
> > @@ -842,8 +842,8 @@ xfs_getfsmap(
> >           !xfs_getfsmap_is_valid_device(mp, &head->fmh_keys[1]))
> >               return -EINVAL;
> >
> > -     use_rmap = capable(CAP_SYS_ADMIN) &&
> > -                xfs_sb_version_hasrmapbt(&mp->m_sb);
> > +     use_rmap = xfs_sb_version_hasrmapbt(&mp->m_sb) &&
> > +                has_capability_noaudit(current, CAP_SYS_ADMIN);
> >       head->fmh_entries = 0;
> >
> >       /* Set up our device handlers. */
> > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > index 3fbd98f61ea5..3cfc1a25069c 100644
> > --- a/fs/xfs/xfs_ioctl.c
> > +++ b/fs/xfs/xfs_ioctl.c
> > @@ -1470,8 +1470,11 @@ xfs_ioctl_setattr(
> >
> >       if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_PQUOTA_ON(mp) &&
> >           ip->i_d.di_projid != fa->fsx_projid) {
> > +             int flags = has_capability_noaudit(current, CAP_FOWNER) ?
> > +                     XFS_QMOPT_FORCE_RES : 0;
> > +
> >               code = xfs_qm_vop_chown_reserve(tp, ip, NULL, NULL, pdqp,
> > -                             capable(CAP_FOWNER) ?  XFS_QMOPT_FORCE_RES : 0);
> > +                             flags);
> >               if (code)       /* out of quota */
> >                       goto error_trans_cancel;
> >       }
>
> You missed a capable() call here - see the call to
> xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in
> xfs_ioctl_setattr_get_trans().

Ah, I mistakenly based the path against an old tree. Sorry, I'll redo
it against xfs/for-next...

>
> > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > index 67c8dc9de8aa..abbb417c4fbd 100644
> > --- a/fs/xfs/xfs_iops.c
> > +++ b/fs/xfs/xfs_iops.c
> > @@ -729,10 +729,12 @@ xfs_setattr_nonsize(
> >               if (XFS_IS_QUOTA_RUNNING(mp) &&
> >                   ((XFS_IS_UQUOTA_ON(mp) && !uid_eq(iuid, uid)) ||
> >                    (XFS_IS_GQUOTA_ON(mp) && !gid_eq(igid, gid)))) {
> > +                     int flags = has_capability_noaudit(current, CAP_FOWNER) ?
> > +                             XFS_QMOPT_FORCE_RES : 0;
> > +
> >                       ASSERT(tp);
> >                       error = xfs_qm_vop_chown_reserve(tp, ip, udqp, gdqp,
> > -                                             NULL, capable(CAP_FOWNER) ?
> > -                                             XFS_QMOPT_FORCE_RES : 0);
> > +                                             NULL, flags);
> >                       if (error)      /* out of quota */
> >                               goto out_cancel;
> >               }
>
> You missed a capable() call here - see the call to
> xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in this
> function.
>
> I think this demonstrates just how fragile and hard to maintain the
> approach being taken here is.
>
> > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> > index bca48b308c02..a99d19c2c11f 100644
> > --- a/fs/xfs/xfs_xattr.c
> > +++ b/fs/xfs/xfs_xattr.c
> > @@ -164,7 +164,7 @@ xfs_xattr_put_listent(
> >                * Only show root namespace entries if we are actually allowed to
> >                * see them.
> >                */
> > -             if (!capable(CAP_SYS_ADMIN))
> > +             if (!has_capability_noaudit(current, CAP_SYS_ADMIN))
> >                       return;
> >
> >               prefix = XATTR_TRUSTED_PREFIX;
>
> This one should absolutely report a denial - someone has tried to
> read the trusted xattr namespace without permission to do so. That's
> exactly the sort of thing I'd want to see in an audit log - just
> because we just elide the xattrs rather than return an error doesn't
> mean we should not leave an audit trail from the attempted access
> of kernel trusted attributes...

I'm not sure about that... without CAP_SYS_ADMIN the caller would
still get the ACL xattrs, no? IIUC, it's a filter to not show
restricted xattrs to unprivileged users via listxattr(2)**, where the
user is not saying "give me the trusted xattrs", just "give me
whatever I'm allowed to see", so logging the denial wouldn't make much
sense - the user may not even care about trusted xattrs when doing the
syscall (and in 99% of cases a user without CAP_SYS_ADMIN really
won't).

(**) But I don't understand how exactly that function is used and what
the XFS_ATTR_ROOT flag means, so I may be getting it wrong...

--
Ondrej Mosnacek
Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.