[PATCH v5 bpf-next 5/5] bpf/selftests: Add a selftest for bpf_getxattr

Christian Brauner brauner at kernel.org
Wed Jun 29 08:11:19 UTC 2022


On Tue, Jun 28, 2022 at 03:28:42PM -0700, Alexei Starovoitov wrote:
> On Tue, Jun 28, 2022 at 10:52 AM KP Singh <kpsingh at kernel.org> wrote:
> >
> > On Tue, Jun 28, 2022 at 7:33 PM Christian Brauner <brauner at kernel.org> wrote:
> > >
> > > On Tue, Jun 28, 2022 at 04:19:48PM +0000, KP Singh wrote:
> > > > A simple test that adds an xattr on a copied /bin/ls and reads it back
> > > > when the copied ls is executed.
> > > >
> > > > Signed-off-by: KP Singh <kpsingh at kernel.org>
> > > > ---
> > > >  .../testing/selftests/bpf/prog_tests/xattr.c  | 54 +++++++++++++++++++
> >
> > [...]
> >
> > > > +SEC("lsm.s/bprm_committed_creds")
> > > > +void BPF_PROG(bprm_cc, struct linux_binprm *bprm)
> > > > +{
> > > > +     struct task_struct *current = bpf_get_current_task_btf();
> > > > +     char dir_xattr_value[64] = {0};
> > > > +     int xattr_sz = 0;
> > > > +
> > > > +     xattr_sz = bpf_getxattr(bprm->file->f_path.dentry,
> > > > +                             bprm->file->f_path.dentry->d_inode, XATTR_NAME,
> > > > +                             dir_xattr_value, 64);
> > >
> > > Yeah, this isn't right. You're not accounting for the caller's userns
> > > nor for the idmapped mount. If this is supposed to work you will need a
> > > variant of vfs_getxattr() that takes the mount's idmapping into account
> > > afaict. See what needs to happen after do_getxattr().
> >
> > Thanks for taking a look.
> >
> > So, if I understand correctly, we don't need xattr_permission (and
> > other checks in vfs_getxattr) here as the BPF programs run as
> > CAP_SYS_ADMIN.
> >
> > but...
> >
> > So, Is this bit what's missing then?
> >
> > error = vfs_getxattr(mnt_userns, d, kname, ctx->kvalue, ctx->size);
> > if (error > 0) {
> >     if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
> > (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
> >         posix_acl_fix_xattr_to_user(mnt_userns, d_inode(d),
> >             ctx->kvalue, error);
> 
> That will not be correct.
> posix_acl_fix_xattr_to_user checking current_user_ns()
> is checking random tasks that happen to be running
> when lsm hook got invoked.
> 
> KP,
> we probably have to document clearly that 'current*'
> should not be used here.
> xattr_permission also makes little sense in this context.
> If anything it can be a different kfunc if there is a use case,
> but I don't see it yet.
> bpf-lsm prog calling __vfs_getxattr is just like other lsm-s that
> call it directly. It's the kernel that is doing its security thing.

Right, but LSMs usually only retrieve their own xattr namespace (ima,
selinux, smack) or they calculate hashes for xattrs based on the raw
filesystem xattr values (evm).

But this new bpf_getxattr() is different. It allows retrieving _any_
xattr in any security hook it can be attached to. So someone can write a
bpf program that retrieves filesystem capabilities or POSIX ACLs. And
these are xattrs that require higher-level vfs involvement to be
sensible in most contexts.

So looking at:

SEC("lsm.s/bprm_committed_creds")
void BPF_PROG(bprm_cc, struct linux_binprm *bprm)
{
	struct task_struct *current = bpf_get_current_task_btf();
	char dir_xattr_value[64] = {0};
	int xattr_sz = 0;

	xattr_sz = bpf_getxattr(bprm->file->f_path.dentry,
				bprm->file->f_path.dentry->d_inode, XATTR_NAME,
				dir_xattr_value, 64);

	if (xattr_sz <= 0)
		return;

	if (!bpf_strncmp(dir_xattr_value, sizeof(XATTR_VALUE), XATTR_VALUE))
		result = 1;
}

This attaches a bpf-lsm program to the security_bprm_committed_creds()
hook. It then retrieves the extended attributes of the file to be
executed. The helper currently always retrieves the raw filesystem values.

But for example any XATTR_NAME_CAPS filesystem capabilities that
might've been stored will be taken into account during exec. And both
the idmapping of the mount and the caller matter when determining whether
they are used or not.

But the current implementation of bpf_getxattr() just ignores both. It
will always retrieve the raw filesystem values. So if one calls this
helper they're not actually retrieving the values as they are seen by
fs/exec.c. And I'm wondering why that is ok? And even if this is ok for
some use-cases it might very well become a security issue in others if
access decisions are always based on the raw values.
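
To make the concern concrete, here is a small userspace toy model (not
kernel code) of why the raw value and the value exec acts on can differ.
Everything in it is an assumption made for illustration: the toy_* names
are hypothetical, an idmapping is reduced to a single extent, and the
"is this root for the caller" check looks at only one user namespace
instead of walking its ancestors.

```c
#include <stdint.h>
#include <stdio.h>

#define TOY_ID_INVALID UINT32_MAX

/*
 * Toy idmapping with a single extent: ids [first, first + count) inside
 * the namespace correspond to [lower_first, lower_first + count) outside.
 */
struct toy_idmap {
	uint32_t first;
	uint32_t lower_first;
	uint32_t count;
};

/* Translate an id inside the namespace to the id outside it. */
static uint32_t toy_map_down(const struct toy_idmap *m, uint32_t id)
{
	if (id < m->first || id - m->first >= m->count)
		return TOY_ID_INVALID;
	return m->lower_first + (id - m->first);
}

/* Translate an id outside the namespace to the id inside it. */
static uint32_t toy_map_up(const struct toy_idmap *m, uint32_t kid)
{
	if (kid < m->lower_first || kid - m->lower_first >= m->count)
		return TOY_ID_INVALID;
	return m->first + (kid - m->lower_first);
}

/*
 * Decide whether a root id stored in an xattr counts as uid 0 for the
 * caller. Passing mnt == NULL models a reader that ignores the mount's
 * idmapping and uses the raw on-disk value directly.
 */
static int toy_rootid_is_caller_root(uint32_t raw_rootid,
				     const struct toy_idmap *mnt,
				     const struct toy_idmap *caller)
{
	uint32_t kid = mnt ? toy_map_down(mnt, raw_rootid) : raw_rootid;

	if (kid == TOY_ID_INVALID)
		return 0;
	return toy_map_up(caller, kid) == 0;
}
```

With a mount and a caller that both use the mapping { 0, 100000, 65536 },
a raw root id of 0 is translated to 100000 by the mount and then back to
0 for the caller, so the model treats it as the caller's root. Passing
NULL for the mount leaves the raw 0 untranslated, it has no mapping in
the caller's namespace, and the answer flips. That is the kind of
divergence I'm worried about when decisions are based on raw values.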

I'm not well-versed in this so bear with me, please.


