[PATCH v2 bpf-next 00/18] BPF token

Hao Luo haoluo at google.com
Tue Jun 13 21:48:27 UTC 2023


On Mon, Jun 12, 2023 at 3:08 PM Andrii Nakryiko
<andrii.nakryiko at gmail.com> wrote:
>
> On Mon, Jun 12, 2023 at 3:49 AM Toke Høiland-Jørgensen <toke at kernel.org> wrote:
> >
<...>
> > to avoid that is by baking the support into libbpf, then that can be
> > done regardless of the mechanism we choose.
> >
> > Or to put it another way: as you say it may be more *complicated* to add
> > an RPC-based path to libbpf, but it's not fundamentally impossible, it's
> > just another technical problem to be solved. And if that added
> > complexity buys us better security properties, maybe that is a good
> > trade-off. At least we shouldn't dismiss it out of hand.
>
> You are oversimplifying this. There is a huge difference between
> syscall and RPC interfaces.
>
> The former (syscall approach) will error out only on invalid inputs
> (and, highly improbably, if the kernel runs out of memory, which means
> your app is dead anyway). You don't code against a syscall interface
> with the expectation that it can fail at any point and that you should
> be able to recover from it.
>
> With RPC you have to bake it into your application that any RPC can
> fail transiently, for many reasons. Service could be down, restarted,
> slow, etc, etc. This changes *everything* in how you develop
> application, how you write code, how you handle errors, how you
> monitor stuff. Everything.
>
> It's impossible to just swap out syscall with RPC transparently
> without introducing horrible consequences. This is not some technical
> difficulty, it's a fundamental impedance mismatch. One of the early
> distributed systems mistakes was to pretend that remote procedure
> calls could be reliable, to assume that errors are rare, and to treat
> them as if they behaved like syscalls or local in-process APIs. It has
> been recognized many times over how bad such approaches were. It's
> outside of the scope of this discussion to go into more details.
> Suffice it to say that libbpf is not going to pretend that syscall and
> some RPC are equivalent and can be interchangeable in a transparent
> way.
>
> And then, even if we were crazy enough to do the above, there is no
> way everyone will settle on one single implementation and/or RPC
> protocol and API such that libbpf could implement it in its upstream
> version. Big companies most probably will go with their own internal
> ones that would give them better integration with internal
> infrastructure, better observability, etc. And even in open-source
> there probably won't be one single implementation everyone will be
> happy with.
>

Hello Toke and Andrii,

I agree with Andrii here. At Google, we have several years of
experience building and using a BPF RPC service, to which we delegate
BPF operations. In our experience, the RPC approach is quite limiting
and becomes impractical for many BPF use cases.

For programs that do not require much user interaction, it works just
fine: the service loads and attaches the programs, and that's all. The
problem is the programs that require a lot of user interaction, for
example, the ones doing observability, which may often read maps or
poll on the BPF ringbuf. The overhead and reliability of RPC are one
concern. Another problem is the BPF operations based on mmap, for
example, directly updating/reading BPF global variables as used in
skeletons. We still haven't figured out how to fully support BPF
skeletons. We also haven't figured out how to support the BPF ringbuf
using RPC. There are also problems keeping this service up to date
with new features in libbpf.

Anyway, I think the syscall interface is heavily baked into libbpf
and the BPF kernel interfaces today. There are many BPF use cases where
delegating all BPF operations to a service can't work well. IMHO, to
achieve a good balance between flexibility and security, some
abstraction that conveys controlled trust from privileged to
unprivileged processes is necessary. The idea of a BPF token makes
sense to me. With a token, the libbpf interface requires only minimal
changes and unprivileged users can call libbpf and the bpf syscall
natively, which wins on efficiency and means less maintenance burden
for libbpf developers.

Thanks,
Hao


