[PATCH v5 bpf-next 2/3] bpf: implement CAP_BPF

Tue May 12 02:36:41 UTC 2020

On Mon, May 11, 2020 at 05:12:10PM -0700, sdf at google.com wrote:
> On 05/08, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast at kernel.org>
> [..]
> > @@ -3932,7 +3977,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr
> > __user *, uattr, unsigned int, siz
> >   	union bpf_attr attr;
> >   	int err;
> 
> > -	if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
> > +	if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
> >   		return -EPERM;
> This is awesome, thanks for reviving the effort!
> 
> One question I have about this particular snippet:
> Does it make sense to drop bpf_capable checks for the operations
> that work on a provided fd?

Above snippet is for the case when sysctl switches unpriv off.
It was a big hammer and stays big hammer.
I certainly would like to improve the situation, but I suspect
the folks who turn that sysctl knob on are simply paranoid about bpf
and no amount of reasoning would turn them around.

> The use-case I have in mind is as follows:
> * privileged (CAP_BPF) process loads the programs/maps and pins
>   them at some known location
> * unprivileged process opens up those pins and does the following:
>   * prepares the maps (and will later on read them)
>   * does SO_ATTACH_BPF/SO_ATTACH_REUSEPORT_EBPF which afaik don't
>     require any capabilities
> 
> This essentially pushes some of the permission checks into a fs layer. So
> whoever has a file descriptor (via unix sock or open) can do BPF operations
> on the object that represents it.

cap_bpf doesn't change things in that regard.
Two cases here:
sysctl_unprivileged_bpf_disabled==0:
  Unpriv can load socket_filter prog type and unpriv can attach it
  via SO_ATTACH_BPF/SO_ATTACH_REUSEPORT_EBPF.
sysctl_unprivileged_bpf_disabled==1:
  cap_sys_admin can load socket_filter and unpriv can attach it.

With addition of cap_bpf in the second case cap_bpf process can
load socket_filter too.
It doesn't mean that permissions are pushed into fs layer.
I'm not sure that relaxing of sysctl_unprivileged_bpf_disabled
will be well received.
Are you proposing to selectively allow certain bpf syscall commands
even when sysctl_unprivileged_bpf_disabled==1 ?
Like allow unpriv to do BPF_OBJ_GET to get an fd from bpffs ?
And allow unpriv to do map_update ? 
It makes complete sense to me, but I'd like to argue about that
independently from this cap_bpf set.
We can relax that sysctl later.