[PATCH v2 bpf-next 0/9] add new acquire/release BPF kfuncs

Mon Mar 18 13:14:43 UTC 2024

On Wed, Mar 13, 2024 at 02:05:13PM -0700, Alexei Starovoitov wrote:
> On Mon, Mar 11, 2024 at 5:01 AM Christian Brauner <brauner at kernel.org> wrote:
> >
> > > > > One can argue that get_mm_exe_file() is not exported,
> > > > > but it's nothing but rcu_lock-wrap plus get_file_rcu()
> > > > > which is EXPORT_SYMBOL.
> > > >
> > > > Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't
> > > > be exported. So that'll be removed asap.
> > >
> > > So, just to make a point that
> > > "Included in that set are functions that aren't currently even
> > > exported to modules"
> > > you want to un-export get_file_rcu() ?
> >
> > No. The reason it was exported was because of the drm subsystem and we
> > already quite disliked that. But it turned out that's not needed so in
> > commit 61d4fb0b349e ("file, i915: fix file reference for
> > mmap_singleton()") they were moved away from this helper.
> 
> Arguably that commit 61d4fb0b349e should have had
> Fixes: 0ede61d8589c ("file: convert to SLAB_TYPESAFE_BY_RCU")
> i915 was buggy before you touched it
> and safe_by_rcu exposed the bug.
> I can see why you guys looked at it, saw issues,
> and decided to look away.
> Though your guess in commit 61d4fb0b349e
> "
>     Now, there might be delays until
>     file->f_op->release::singleton_release() is called and
>     i915->gem.mmap_singleton is set to NULL.
> "
> feels unlikely.
> I suspect release() delay cannot be that long to cause rcu stall.
> In the log prior to the splat there are just two mmap related calls
> from selftests in i915_gem_mman_live_selftests():
> i915: Running i915_gem_mman_live_selftests/igt_mmap_offset_exhaustion
> i915: Running i915_gem_mman_live_selftests/igt_mmap
> 1st mmap test passed, but 2nd failed.
> So it looks like it's not a race, but an issue with cleanup in that driver.
> And instead of getting to the bottom of the issue
> you've decided to paper over with get_file_active().
> I agree with that trade-off.
> But the bug in i915 is still there and it's probably an UAF.
> get_file_active() is probably operating on a broken 'struct file'
> that got to zero, but somehow it still around
> or it's just a garbage memory and file->f_count
> just happened to be zero.
> 
> My point is that it's not ok to have such double standards.
> On one side you're arguing that we shouldn't introduce kfunc:
> +__bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task)
> +{
> + return get_task_exe_file(task);
> +}
> that cleanly takes ref cnt on task->mm->exe_file and _not_ using lower
> level get_file/get_file_rcu/get_file_active api-s directly which
> are certainly problematic to expose anywhere, since safe_by_rcu
> protocol is delicate.
> 
> But on the other side there is buggy i915 that does
> questionable dance with get_file_active().
> It's EXPORT_SYMBOL_GPL as well and out of tree driver can
> ruin safe_by_rcu file properties with hard to debug consequences.

You're lending strong support for my earlier point. Because this is a
clear an example where a subsystem got access to a helper that it
shouldn't have had access to. So we fixed the issue.

But this whole polemic just illustrates that you simply didn't bother to
understand how the code works. The way you talk about UAF together with
SLAB_TYPESAFE_BY_RCU is telling. Please read the code instead of
guessing.

So the same way we don't have to take responsibility for bpf
misunderstanding the expectations of d_path() we don't have to take
responsibility for misusing an internal helper by another subsystem.

So your argument here is moot at best and polemical and opportunistic at
worst. It certainly doesn't illustrate what you think it does.

And the above is fundamentally a suspiciously long sideshow. So let's
get back to the core topic: Unless you document your rules when it's ok
for a bpf program to come along and request access to internal apis
patchsets such as this are not acceptable.

> 
> > There is absolutely no way that any userspace will
> > get access to such low-level helpers. They have zero business to be
> > involved in the lifetimes of objects on this level just as no module has.
> 
> correct, and kfuncs do not give bpf prog to do direct get_file*() access
> because we saw how tricky safe_by_rcu is.
> Hence kfuncs acquire file via get_task_exe_file or get_mm_exe_file
> and release via fput.
> That's the same pattern that security/tomoyo/util.c is doing:
>    exe_file = get_mm_exe_file(mm);
>    if (!exe_file)
>         return NULL;
> 
>    cp = tomoyo_realpath_from_path(&exe_file->f_path);
>    fput(exe_file);
> 
> in bpf_lsm case it will be:
> 
>    exe_file = bpf_get_mm_exe_file(mm);
>    if (!exe_file)
>    // the verifier will enforce that bpf prog has this NULL check here
>    // because we annotate kfunc as:
> BTF_ID_FLAGS(func, bpf_get_mm_exe_file, KF_ACQUIRE | KF_TRUSTED_ARGS |
> KF_RET_NULL)
> 
>  bpf_path_d_path(&exe_file->f_path, ...);
>  bpf_put_file(exe_file);
> // and the verifier will enforce that bpf_put_file() is called too.
> // and there is no path out of this bpf program that can take file refcnt
> // without releasing.
> 
> So really these kfuncs are a nop from vfs pov.
> If there is a bug in the verifier we will debug it and we will fix it.
> 
> You keep saying that bpf_d_path() is a mess.
> Right. It is a mess now and we're fixing it.
> When it was introduced 4 years ago it was safe at that time.

Uhm, no it was always sketchy.

> The unrelated verifier "smartness" made it possible to use it in UAF.
> We found the issue now and we're fixing it.
> Over these years we didn't ask vfs folks to help fix such bugs,
> and not asking for help now.
> You're being cc-ed on the patches to be aware on how we plan to fix
> this bpf_d_path() mess. If you have a viable alternative please suggest.

The fix is to export a variant with trusted args.

> As it stands the new kfuncs are clean and safe way to solve this mess.

I will remind you of what you have been told in [1]:

"No. It's not up to maintainers to suggest alternatives. Sometimes it's
simply enough to explain *why* something isn't acceptable.

A plain "no" without explanation isn't sufficient. NAKs need a good
reason. But they don't need more than that.

The onus of coming up with an acceptable solution is on the person who
needs something new."

You've been provided:

a) good reasons why the patchset in it's current form isn't acceptable
   repeated multiple times
b) support for exporting a variant of bpf_d_path() that is safe to use
c) a request that all kfunc exports for the vfs will have to be located
   under fs/, not in kernel/bpf/
d) a path on how to move forward with additional kfunc requests:
   Clear and documented rules when it's ok for someone to come along and
   request access to bpf kfuncs when it's to be rejected and when it's
   ok to be supported.

You repeatedly threatening to go over the heads of people will not make
them more amenable to happily integrate with your subsystem.

[1]: https://lore.kernel.org/all/CAHk-=whD2HMe4ja5nR6WWofUh3nLmhjoSPDvZm2-XMGjeie5Tg@mail.gmail.com