[PATCH] xfs: use has_capability_noaudit() instead of capable() where appropriate

Tue Mar 16 20:50:10 UTC 2021

On Tue, Mar 16, 2021 at 06:32:26PM +0100, Ondrej Mosnacek wrote:
> In cases when a negative result of a capability check doesn't lead to an
> immediate, user-visible error, only a subtle difference in behavior, it
> is better to use has_capability_noaudit(current, ...), so that LSMs
> (e.g. SELinux) don't generate a denial record in the audit log each time
> the capability status is queried. This patch should cover all such cases
> in fs/xfs/.

Is this something new? I only see 4 calls to
has_capability_noaudit() in 5.12-rc3...

Also, has_capability_noaudit() is an awful name. capable_noaudit()
would actually be self explanatory to anyone who is used to doing
capability checks via capable(), ns_capable(), ns_capable_noaudit(),
inode_owner_or_capable(), capable_wrt_inode_uidgid(), etc...

Please fix the name of this function to be consistent with the
existing capability APIs before propagating it further into the
kernel.

> Note that I kept the capable(CAP_FSETID) checks, since these will only
> be executed if the user explicitly tries to set the SUID/SGID bit, and
> it likely makes sense to log such attempts even if the syscall doesn't
> fail and just ignores the bits.

So how on earth are we supposed to maintain this code correctly?
These are undocumented rules that seemingly are applied to random
subsystems and to seemingly random capable() calls in those
subsystems. ANd you don't even document it in this code where there
are other capable(...) checks that will generate audit records...

How are we supposed to know when an audit record should be emitted
or not by some unknown LSM when we do a capability check?
Capabilities are already an awful nightmare maze of similar but
slightly different capability checks, and this doesn't improve the
situation at all.

Please make this easier to get right iand maintain correctly (an
absolute, non-negotiable requirement for anything security related)
before you spray yet another poorly documented capability function
into the wider kernel.

> Signed-off-by: Ondrej Mosnacek <omosnace at redhat.com>
> ---
>  fs/xfs/xfs_fsmap.c | 4 ++--
>  fs/xfs/xfs_ioctl.c | 5 ++++-
>  fs/xfs/xfs_iops.c  | 6 ++++--
>  fs/xfs/xfs_xattr.c | 2 +-
>  4 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
> index 9ce5e7d5bf8f..14672e7ee535 100644
> --- a/fs/xfs/xfs_fsmap.c
> +++ b/fs/xfs/xfs_fsmap.c
> @@ -842,8 +842,8 @@ xfs_getfsmap(
>  	    !xfs_getfsmap_is_valid_device(mp, &head->fmh_keys[1]))
>  		return -EINVAL;
>  
> -	use_rmap = capable(CAP_SYS_ADMIN) &&
> -		   xfs_sb_version_hasrmapbt(&mp->m_sb);
> +	use_rmap = xfs_sb_version_hasrmapbt(&mp->m_sb) &&
> +		   has_capability_noaudit(current, CAP_SYS_ADMIN);
>  	head->fmh_entries = 0;
>  
>  	/* Set up our device handlers. */
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 3fbd98f61ea5..3cfc1a25069c 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1470,8 +1470,11 @@ xfs_ioctl_setattr(
>  
>  	if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_PQUOTA_ON(mp) &&
>  	    ip->i_d.di_projid != fa->fsx_projid) {
> +		int flags = has_capability_noaudit(current, CAP_FOWNER) ?
> +			XFS_QMOPT_FORCE_RES : 0;
> +
>  		code = xfs_qm_vop_chown_reserve(tp, ip, NULL, NULL, pdqp,
> -				capable(CAP_FOWNER) ?  XFS_QMOPT_FORCE_RES : 0);
> +				flags);
>  		if (code)	/* out of quota */
>  			goto error_trans_cancel;
>  	}

You missed a capable() call here - see the call to
xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in
xfs_ioctl_setattr_get_trans().

> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 67c8dc9de8aa..abbb417c4fbd 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -729,10 +729,12 @@ xfs_setattr_nonsize(
>  		if (XFS_IS_QUOTA_RUNNING(mp) &&
>  		    ((XFS_IS_UQUOTA_ON(mp) && !uid_eq(iuid, uid)) ||
>  		     (XFS_IS_GQUOTA_ON(mp) && !gid_eq(igid, gid)))) {
> +			int flags = has_capability_noaudit(current, CAP_FOWNER) ?
> +				XFS_QMOPT_FORCE_RES : 0;
> +
>  			ASSERT(tp);
>  			error = xfs_qm_vop_chown_reserve(tp, ip, udqp, gdqp,
> -						NULL, capable(CAP_FOWNER) ?
> -						XFS_QMOPT_FORCE_RES : 0);
> +						NULL, flags);
>  			if (error)	/* out of quota */
>  				goto out_cancel;
>  		}

You missed a capable() call here - see the call to
xfs_trans_alloc_ichange( ... capabale(CAP_FOWNER), ...); in this
function.

I think this demonstrates just how fragile and hard to maintain the
approach being taken here is.

> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> index bca48b308c02..a99d19c2c11f 100644
> --- a/fs/xfs/xfs_xattr.c
> +++ b/fs/xfs/xfs_xattr.c
> @@ -164,7 +164,7 @@ xfs_xattr_put_listent(
>  		 * Only show root namespace entries if we are actually allowed to
>  		 * see them.
>  		 */
> -		if (!capable(CAP_SYS_ADMIN))
> +		if (!has_capability_noaudit(current, CAP_SYS_ADMIN))
>  			return;
>  
>  		prefix = XATTR_TRUSTED_PREFIX;

This one should absolutely report a denial - someone has tried to
read the trusted xattr namespace without permission to do so. That's
exactly the sort of thing I'd want to see in an audit log - just
because we just elide the xattrs rather than return an error doesn't
mean we should not leave an audit trail from the attempted access
of kernel trusted attributes...

Cheers,

Dave.
-- 
Dave Chinner
david at fromorbit.com