BPF LSM and fexit [was: [PATCH bpf-next v3 04/10] bpf: lsm: Add mutable hooks list for the BPF LSM]

Tue Feb 11 20:33:49 UTC 2020

()On Tue, Feb 11, 2020 at 9:10 PM Alexei Starovoitov
<alexei.starovoitov at gmail.com> wrote:
> On Tue, Feb 11, 2020 at 08:36:18PM +0100, Jann Horn wrote:
> > On Tue, Feb 11, 2020 at 8:09 PM Alexei Starovoitov
> > <alexei.starovoitov at gmail.com> wrote:
> > > On Tue, Feb 11, 2020 at 07:44:05PM +0100, Jann Horn wrote:
> > > > On Tue, Feb 11, 2020 at 6:58 PM Alexei Starovoitov
> > > > <alexei.starovoitov at gmail.com> wrote:
> > > > > On Tue, Feb 11, 2020 at 01:43:34PM +0100, KP Singh wrote:
> > > > [...]
> > > > > > * When using the semantic provided by fexit, the BPF LSM program will
> > > > > >   always be executed and will be able to override / clobber the
> > > > > >   decision of LSMs which appear before it in the ordered list. This
> > > > > >   semantic is very different from what we currently have (i.e. the BPF
> > > > > >   LSM hook is only called if all the other LSMs allow the action) and
> > > > > >   seems to be bypassing the LSM framework.
> > > > >
> > > > > It that's a concern it's trivial to add 'if (RC == 0)' check to fexit
> > > > > trampoline generator specific to lsm progs.
> > > > [...]
> > > > > Using fexit mechanism and bpf_sk_storage generalization is
> > > > > all that is needed. None of it should touch security/*.
> > > >
> > > > If I understand your suggestion correctly, that seems like a terrible
> > > > idea to me from the perspective of inspectability and debuggability.
> > > > If at runtime, a function can branch off elsewhere to modify its
> > > > decision, I want to see that in the source code. If someone e.g.
> > > > changes the parameters or the locking rules around a security hook,
> > > > how are they supposed to understand the implications if that happens
> > > > through some magic fexit trampoline that is injected at runtime?
> > >
> > > I'm not following the concern. There is error injection facility that is
> > > heavily used with and without bpf. In this case there is really no difference
> > > whether trampoline is used with direct call or indirect callback via function
> > > pointer. Both will jump to bpf prog. The _source code_ of bpf program will
> > > _always_ be available for humans to examine via "bpftool prog dump" since BTF
> > > is required. So from inspectability and debuggability point of view lsm+bpf
> > > stuff is way more visible than any builtin LSM. At any time people will be able
> > > to see what exactly is running on the system. Assuming folks can read C code.
> >
> > You said that you want to use fexit without touching security/, which
> > AFAIU means that the branch from security_*() to the BPF LSM will be
> > invisible in the *kernel's* source code unless the reader already
> > knows about the BPF LSM. But maybe I'm just misunderstanding your
> > idea.
> >
> > If a random developer is trying to change the locking rules around
> > security_blah(), and wants to e.g. figure out whether it's okay to
> > call that thing with a spinlock held, or whether one of the arguments
> > is actually used, or stuff like that, the obvious way to verify that
> > is to follow all the direct and indirect calls made from
> > security_blah(). It's tedious, but it works, unless something is
> > hooked up to it in a way that is visible in no way in the source code.
> >
> > I agree that the way in which the call happens behind the scenes
> > doesn't matter all that much - I don't really care all that much
> > whether it's an indirect call, a runtime-patched direct call in inline
> > assembly, or an fexit hook. What I do care about is that someone
> > reading through any affected function can immediately see that the
> > branch exists - in other words, ideally, I'd like it to be something
> > happening in the method body, but if you think that's unacceptable, I
> > think there should at least be a function attribute that makes it very
> > clear what's going on.
>
> Got it. Then let's whitelist them ?
> All error injection points are marked with ALLOW_ERROR_INJECTION().
> We can do something similar here, but let's do it via BTF and avoid
> abusing yet another elf section for this mark.
> I think BTF_TYPE_EMIT() should work. Just need to pick explicit enough
> name and extensive comment about what is going on.

Sounds reasonable to me. :)

> Locking rules and cleanup around security_blah() shouldn't change though.
> Like security_task_alloc() should be paired with security_task_free().
> And so on. With bpf_sk_storage like logic the alloc/free of scratch
> space will be similar to the way socket and bpf progs deal with it.
>
> Some of the lsm hooks are in critical path. Like security_socket_sendmsg().
> retpoline hurts. If we go with indirect calls right now it will be harder to
> optimize later. It took us long time to come up with bpf trampoline and build
> bpf dispatcher on top of it to remove single indirect call from XDP runtime.
> For bpf+lsm would be good to avoid it from the start.

Just out of curiosity: Are fexit hooks really much cheaper than indirect calls?

AFAIK ftrace on x86-64 replaces the return pointer for fexit
instrumentation (see prepare_ftrace_return()). So when the function
returns, there is one return misprediction for branching into
return_to_handler(), and then the processor's internal return stack
will probably be misaligned so that after ftrace_return_to_handler()
is done running, all the following returns will also be mispredicted.

So I would've thought that fexit hooks would have at least roughly the
same impact as indirect calls - indirect calls via retpoline do one
mispredicted branch, fexit hooks do at least two AFAICS. But I guess
indirect calls could still be slower if fexit benefits from having all
the mispredicted pointers stored on the cache-hot stack while the
indirect branch target is too infrequently accessed to be in L1D, or
something like that?