[PATCH RESEND v3 bpf-next 01/14] bpf: introduce BPF token object
Toke Høiland-Jørgensen
toke at redhat.com
Fri Jul 7 22:00:10 UTC 2023

Andrii Nakryiko <andrii.nakryiko at gmail.com> writes:

> On Fri, Jul 7, 2023 at 6:04 AM Toke Høiland-Jørgensen <toke at redhat.com> wrote:
>>
>> Andrii Nakryiko <andrii.nakryiko at gmail.com> writes:
>>
>> > On Thu, Jul 6, 2023 at 4:32 AM Toke Høiland-Jørgensen <toke at redhat.com> wrote:
>> >>
>> >> Andrii Nakryiko <andrii.nakryiko at gmail.com> writes:
>> >>
>> >> > Having it as a separate single-purpose FS seems cleaner, because we
>> >> > have use cases where we'd have one BPF FS instance created for a
>> >> > container by our container manager, and then exposing a few separate
>> >> > tokens with different sets of allowed functionality. E.g., one for
>> >> > main intended workload, another for some BPF-based observability
>> >> > tools, maybe yet another for more heavy-weight tools like bpftrace for
>> >> > extra debugging. In the debugging case our container infrastructure
>> >> > will be "evacuating" any other workloads on the same host to avoid
>> >> > unnecessary consequences. The point is to not disturb
>> >> > workload-under-human-debugging as much as possible, so we'd like to
>> >> > keep userns intact, which is why mounting extra (more permissive) BPF
>> >> > token inside already running containers is an important consideration.
>> >>
>> >> This example (as well as Yafang's in the sibling subthread) makes it
>> >> even more apparent to me that it would be better with a model where the
>> >> userspace policy daemon can just make decisions on each call directly,
>> >> instead of mucking about with different tokens with different embedded
>> >> permissions. Why not go that route (see my other reply for details on
>> >> what I mean)?
>> >
>> > I don't know how you arrived at this conclusion,
>>
>> Because it makes it apparent that you're basically building a policy
>> engine in the kernel with this...
>
> I disagree that this is a policy engine in the kernel. It's a building
> block for delegation and enforcement. The policy itself is implemented
> in user-space by a privileged process that decides when to issue BPF
> tokens and of which configuration. And, optionally and if necessary,
> further restricting using BPF LSM in a more fine-grained and dynamic
> way.

Right, and I'm saying that it's too coarse-grained to be a proper
building block in its own right. As evidenced by the need for adding an
LSM on top to do anything fine-grained; a task which is decidedly
non-trivial to get right, BTW. Which means that the path of least
resistance is going to be to just grant a token and not bother with the
LSM, thus ending up with this being a giant foot gun from a security
PoV.

>> > but we've debated BPF proxying and separate service at length, there
>> > is no point in going on another round here.
>>
>> You had some objections to explicit proxying via RPC calls; I suggested
>> a way of avoiding that by keeping the kernel in the loop, which you have
>
> I thought we settled the seccomp notify proposal?

Your objection to that was that it was too much of a hack to read all
the target process memory (etc) from the policy daemon, which I
acknowledged and suggested a way of keeping the kernel in the loop so it
can take responsibility for the gnarly bits while still allowing
userspace to actually make the decision:

https://lore.kernel.org/r/87v8ezb6x5.fsf@toke.dk

(Last two paragraphs). Maybe that message just got lost somewhere on its
way to your inbox?

>> not responded to. If you're just going to go ahead with your solution
>> over any objections you could just have stated so from the beginning and
>> saved us all a lot of time :/
>
> It would also be good to understand that yours is but one of the
> opinions. If you read the thread carefully you'll see that other
> people have differing opinions. And yours doesn't necessarily have to
> be the deciding one.
>
> I appreciate the feedback, but I don't appreciate the expectation that
> your feedback is binding in any way.

I'm not expecting veto rights, I'm objecting to being ignored. The way
this development process is *supposed* to work (as far as I'm concerned)
is that someone proposes a patch series, the community provides
feedback, and discussion proceeds until there's at least rough consensus
that the solution we've arrived at is the right way forward.

If you're going to cut that process short and just pick and choose which
comments are worth addressing and which are not, I can't stop you,
obviously; but at least do me the favour of being up front about it so I
can stop wasting my time trying to be constructive.

Anyhow, I guess this point is moot for this discussion since I'm about
to leave for vacation for four weeks and won't be able to follow up on
this. Apologies for the bad timing :/ I'll ping some RH folks and try to
get them to keep an eye on this while I'm away...

>> Can we at least put this thing behind a kconfig option, so we can turn
>> it off in distro kernels?
>
> Why can't distro disable this in some more dynamic way, though? With
> existing LSM mechanism, sysctl, whatever? I think it would be useful
> to let users have control over this and decide for themselves without
> having to rebuild a custom kernel.

A sysctl similar to the existing one for unprivileged BPF would be fine
as well. If an LSM ends up being the only way to control it, though,
that will carry so much operational overhead for us to get to a working
state that it'll most likely be simpler to just patch it out of the
kernel.

-Toke