[PATCH v5 bpf-next 2/3] bpf: implement CAP_BPF

Tue May 12 18:39:11 UTC 2020

On Tue, May 12, 2020 at 08:54:11AM -0700, sdf at google.com wrote:
> On 05/11, Alexei Starovoitov wrote:
> > On Mon, May 11, 2020 at 05:12:10PM -0700, sdf at google.com wrote:
> > > On 05/08, Alexei Starovoitov wrote:
> > > > From: Alexei Starovoitov <ast at kernel.org>
> > > [..]
> > > > @@ -3932,7 +3977,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr
> > > > __user *, uattr, unsigned int, siz
> > > >   	union bpf_attr attr;
> > > >   	int err;
> > >
> > > > -	if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
> > > > +	if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
> > > >   		return -EPERM;
> > > This is awesome, thanks for reviving the effort!
> > >
> > > One question I have about this particular snippet:
> > > Does it make sense to drop bpf_capable checks for the operations
> > > that work on a provided fd?
> 
> > Above snippet is for the case when sysctl switches unpriv off.
> > It was a big hammer and stays big hammer.
> > I certainly would like to improve the situation, but I suspect
> > the folks who turn that sysctl knob on are simply paranoid about bpf
> > and no amount of reasoning would turn them around.
> Yeah, and we do use it unfortunately :-( I suppose we still would
> like to keep it that way for a while, but maybe start relaxing
> some operations a bit.

I suspect that was done couple years ago when spectre was just discovered and
the verifier wasn't doing speculative analysis.
If your security folks still insist on that sysctl they either didn't
follow bpf development or paranoid.
If former is the case I would love to openly discuss all the advances in
the verification logic to prevent side channels.
The verifier is doing amazing job finding bad assembly code.
There is no other tool that is similarly capable.
All compilers and static analyzers are no where close to the level
of sophistication that the verifier has in detection of bad speculation.

> > > The use-case I have in mind is as follows:
> > > * privileged (CAP_BPF) process loads the programs/maps and pins
> > >   them at some known location
> > > * unprivileged process opens up those pins and does the following:
> > >   * prepares the maps (and will later on read them)
> > >   * does SO_ATTACH_BPF/SO_ATTACH_REUSEPORT_EBPF which afaik don't
> > >     require any capabilities
> > >
> > > This essentially pushes some of the permission checks into a fs layer.
> > So
> > > whoever has a file descriptor (via unix sock or open) can do BPF
> > operations
> > > on the object that represents it.
> 
> > cap_bpf doesn't change things in that regard.
> > Two cases here:
> > sysctl_unprivileged_bpf_disabled==0:
> >    Unpriv can load socket_filter prog type and unpriv can attach it
> >    via SO_ATTACH_BPF/SO_ATTACH_REUSEPORT_EBPF.
> > sysctl_unprivileged_bpf_disabled==1:
> >    cap_sys_admin can load socket_filter and unpriv can attach it.
> Sorry, I wasn't clear enough, I was talking about unpriv_bpf_disabled=1
> case.
> 
> > With addition of cap_bpf in the second case cap_bpf process can
> > load socket_filter too.
> > It doesn't mean that permissions are pushed into fs layer.
> > I'm not sure that relaxing of sysctl_unprivileged_bpf_disabled
> > will be well received.
> > Are you proposing to selectively allow certain bpf syscall commands
> > even when sysctl_unprivileged_bpf_disabled==1 ?
> > Like allow unpriv to do BPF_OBJ_GET to get an fd from bpffs ?
> > And allow unpriv to do map_update ?
> Yes, that's the gist of what I'm proposing. Allow the operations that
> work on fd even with unpriv_bpf_disabled=1. The assumption that
> obtaining fd requires a privileged operation on its own and
> should give enough protection.

I agree.

> 
> > It makes complete sense to me, but I'd like to argue about that
> > independently from this cap_bpf set.
> > We can relax that sysctl later.
> Ack, thanks, let me bring it up again later, when we get to the cap_bpf
> state.

Thanks for the feedback.
Just to make sure we're on the same page let me clarify one more thing.
The state of cap_bpf in this patch set is not the final state of bpf
security in general. We were stuck on cap_bpf proposal since september.
bpf community lost many months of what could have been gradual
improvements in bpf safety and security.
This cap_bpf is a way to get us unstuck. There will be many more
security related patches that improve safety, security and usability.