Closing the BPF map permission loophole

Fri Sep 30 09:56:45 UTC 2022

On Thu, 2022-09-29 at 18:30 -0400, Paul Moore wrote:
> On Thu, Sep 29, 2022 at 3:55 AM Roberto Sassu
> <roberto.sassu at huaweicloud.com> wrote:
> > On Wed, 2022-09-28 at 20:24 -0400, Paul Moore wrote:
> > > I only became aware of this when the LSM list was CC'd so I'm a
> > > little
> > > behind on what is going on here ... looking quickly through the
> > > mailing list archive it looks like there is an issue with BPF map
> > > permissions not matching well with their associated fd
> > > permissions,
> > > yes?  From a LSM perspective, there are a couple of hooks that
> > > currently use the fd's permissions (read/write) to determine the
> > > appropriate access control check.
> > 
> > From what I understood, access control on maps is done in two
> > steps.
> > First, whenever someone attempts to get a fd to a map
> > security_bpf_map() is called. LSM implementations could check
> > access if
> > the current process has the right to access the map (whose label
> > can be
> > assigned at map creation time with security_bpf_map_alloc()).
> 
> [NOTE: SELinux is currently the only LSM which provides BPF access
> controls, so they are going to be my example here and in the rest of
> this email.]
> 
> In the case of SELinux, security_bpf_map() does check the access
> between the current task and the BPF map itself (which inherits its
> security label from its creator), with the actual permission
> requested
> being determined by the fmode_t parameter passed to the LSM hook.
> Looking at the current BPF code, the callers seem to take that from
> various different places
> (bpf_attr:{file_flags,map_flags,open_flags}).
> This could be due solely to the different operations being done by
> the
> callers, which would make me believe everything is correct, but given
> this thread it seems reasonable for someone with a better
> understanding of BPF than me to double check this.  Can you help
> verify that everything is okay here?

I also don't have concerns on this part. eBPF maintainers can help to
confirm.

> Second, whenever the holder of the obtained fd wants to do an
> > operation
> > on the map (lookup, update, delete, ...), eBPF checks if the fd
> > modes
> > are compatible with the operation to perform (e.g. lookup requires
> > FMODE_CAN_READ).
> 
> To be clear, from what I can see, it looks like the LSM is not
> checking the fd modes, but rather the modes stored in the bpf_attr,
> which I get the impression do not always match the fd modes.  Yes?
> No?

Correct, there is no revalidation. Fd modes represent what LSMs granted
to the fd holder. eBPF allows operations on the object behind the fd
depending on the stored fd modes (for maps, not sure about the other
object types).

> There is also LSM/SELinux code which checks the permissions when a
> BPF
> map is passed between tasks via a fd.  Currently the SELinux check
> only looks at the file:f_mode to get the permissions to check, but if
> the f_mode bits are not the authoritative record of what is allowed
> in
> the BPF map, perhaps we need to change that to use one of the
> bpf_attr
> mode bits (my gut feeling is bpf_attr:open_flags)?
> 
> > One problem is that the second part is missing for some operations
> > dealing with the map fd:
> > 
> > Map iterators:
> > https://lore.kernel.org/bpf/20220906170301.256206-1-roberto.sassu@huaweicloud.com/
> 
> You'll need to treat me like an idiot when it comes to BPF maps ;)

Argh, sorry!

> I did a very quick read on them right now and it looks like a BPF map
> iterator would just be a combination of BPF read and execute ("bpf {
> map_read prog_run }" in SELinux policy terms).  Would it make more
> sense to just use the existing security_bpf_map() and
> security_bpf_prog() hooks here?

If I can make an example, if you have a linked list, a map iterator is
like calling a function for each element of the list, allowing the
function to read and write the element.

For now I didn't think about programs. Lorenz, which reported this
issue in the first place, is planning on rethinking access control on
all eBPF objects.

When you create a map iterator, you pass the program you want to run
and a map reference to bpftool (user space). bpftool first retrieves
the fds for the program and the map (thus, security_bpf_prog() and
security_bpf_map() are called). The fds are then passed to the kernel
with another system call to create the map iterator. The map iterator
is a named kernel object, with a corresponding file in the bpf
filesystem. Reading that file causes the iteration to start, by running
the program for each map element.

The only missing part is checking the fd modes granted by LSMs at the
time the map iterator is created. If the map iterator allows read and
write, both FMODE_CAN_READ and FMODE_CAN_WRITE should be set.

> > Map fd directly used by eBPF programs without system call:
> > https://lore.kernel.org/bpf/20220926154430.1552800-1-roberto.sassu@huaweicloud.com/
> 
> Another instance of "can you please explain this use case?" ;)

Lorenz discovered that. Despite LSMs check which permissions user space
requested for a map fd, there is no corresponding check of the fd
modes. Normally, a map fd is passed in a system call to perform a map
operation. In this case, it is not. It is set in the program code, and
eBPF transforms the map fd into a map address suitable to be passed to
kernel helpers which directly access the kernel object.

> I'm not going to hazard too much of a guess here, but if the map is
> being passed between tasks and a fd is generated from that map, we
> may
> be able to cover this with logic similar
> security/selinux/hooks.c:bpf_fd_pass() ... but I'm really stretching
> my weak understanding of BPF here.
> 
> > Another problem is that there is no DAC, only MAC (work in
> > progress). I
> > don't know exactly the status of enabling unprivileged eBPF.
> 
> It is my opinion that we need to ensure both DAC and MAC are present
> in the code.  This thread makes me worry that some eBPF DAC controls
> are being ignored because one can currently say "we're okay because
> you need privilege!".  That may be true today, but I can imagine a
> time in the not too distant future where unpriv eBPF is allowed and
> we
> suddenly have to bolt on a lot of capable() checks ... which is a
> great recipe for privilege escalation bugs.
> 
> > Apart from this, now the discussion is focusing on the following
> > problem. A map (kernel object) can be referenced in two ways: by ID
> > or
> > by path. By ID requires CAP_ADMIN, so we can consider by path for
> > now.
> > 
> > Given a map fd, the holder of that fd can create a new reference
> > (pinning) to the map in the bpf filesystem (a new file whose
> > private
> > data contains the address of the kernel object).
> > 
> > Pinning a map does not have a corresponding permission. Any fd mode
> > is
> > sufficient to do the operation. Furthermore, subsequent requests to
> > obtain a map fd by path could result in receiving a read-write fd,
> > while at the time of pinning the fd was read-only.
> 
> Since the maps carry their own label I think we are mostly okay, even
> if the map is passed between tasks by some non-fd related mechanism.
> However, I am now slightly worried that if a fd is obtained with
> perms
> that don't match the underlying map and that fd is then passed to
> another task the access control check on the fd passing would not be
> correct.  Operations on the map from a SELinux perspective should
> still be okay (the map has its own label), but still.
> 
> I'm wondering if we do want to move the SELinux BPF fd passing code
> to
> check the bpf_attr:open_flags perms.  Thoughts?

Seems correct as it is, but I'm not completely familiar with this work.
As long as you ensure that the fd holder has the right to access the
map, then it will be still responsibility of eBPF to just check the fd
modes.

> While this does not seem to me a concern from MAC perspective, as
> > attempts to get a map fd still have to pass through
> > security_bpf_map(),
> > in general this should be fixed without relying on LSMs.
> 
> Agreed.  The access controls need to work both for DAC and DAC+LSM.

@all, what do you think?

> Is the plan to ensure that the map and fd permissions are correct
> > > at
> > > the core BPF level, or do we need to do some additional checks in
> > > the
> > > LSMs (currently only SELinux)?
> > 
> > Should we add a new map_pin permission in SELinux?
> 
> Maybe?  Maybe not?  I don't know, can you help me understand map
> pinning a bit more first?

I'm not completely sure that this is correct. Pinning a map seems to me
like creating a new dentry for the inode.

> Should we have DAC to restrict pinnning without LSMs?
> 
> Similar to above.

If we had DAC, even without restricting pinning, fds have to be
obtained if the DAC check passed (if we don't want to rely exclusively

on MAC). Even if pinning was done with a read-only fd, a new read-write fd can be obtained if the owner allowed you to do so.

Restricting pinning might be risky to break compatibility with existing
applications.

Roberto